Our research is driven by the question: How do we optimally apply computational statistics and machine learning to big biological data to produce actionable insights and predictions?
In practice, our work involves core research projects, the management and analysis of big biological data, and the development of new analysis methods and associated algorithms. A major component of our work is data wrangling: processing data sets into analyzable forms.
The data we work with include genomic and other high-throughput molecular data (genomic, epigenomic, metabolic, and proteomic data, from cell-free to single-cell to spatial, collected on next-generation sequencing (NGS), array, and mass spectrometry platforms); medical and disease imaging data (from digital pathology to magnetic resonance imaging (MRI)); and clinical health data (clinical trial electronic data capture (EDC), electronic health record (EHR), and real-world evidence (RWE) data).
We also regularly develop computational statistics and machine learning methods for specific applications. Our output includes theory (deriving theorems that impact our understanding of the potential of computational analysis methodology and the properties of algorithms); computational statistics (development of regularized / penalized generalized linear mixed models (GLMMs), hierarchical mixture prior Bayesian models, and the simpler forms of these models); machine learning (development of non-linear dimension reduction and clustering, regression trees and random forests, support vector machines (SVMs), probabilistic graphical models, and convolutional neural networks (CNNs)); and algorithms (including Expectation-Maximization (EM), variational Bayes, and Markov chain Monte Carlo (MCMC)).
Our core research projects are mostly within four major research areas:
Our current projects include development of phylogenetic methods for microbiome analysis, use of human pedigree information to improve polygenic risk scores, epistatic analysis of complex diseases, development of multi-omics cancer detection diagnostics from cell-free assays, single-cell analysis of the impacts of gene therapies, development of medical image biomarkers of disease severity, and mining of electronic health records for predictors of drug response.
Please see our publications or contact us for more information on our current work.
Copyright © 2024 MezeyLab - All Rights Reserved.