Statistics plays a critical role in drawing and even defining conclusions from scientific experiments across various fields. In today’s era of rapid data growth, statistics faces the challenge of developing scalable, robust, and interpretable methods for complex data environments. My primary research focuses on developing statistical methodologies and theories, particularly within Bayesian statistics, that provide robust statistical inference and uncover the true underlying data-generating process, especially in high-dimensional and complex data. Much of my methodological research is applied to genetics, multi-omics, and genetic epidemiology.
My career goal is to develop innovative statistical methods that address challenges in human genomics and genetics, further advance Bayesian statistical theory, and collaborate with colleagues to apply these methods in the study of complex human diseases. My Ph.D. dissertation, which focused on discrete and continuous model selection in the high-dimensional regime, serves as the foundation of my current research. This research focuses on identifying causal effects of genetic variants, through either methodological innovation or applied analysis, in genetics and genetic epidemiology studies.
Statistical Genetics
The first thread of my research focuses on developing statistical genetic methods, such as statistical fine-mapping and genome-wide association studies (GWAS), to pinpoint causal variants and genes associated with complex human traits. These methods have greatly advanced genetic data accessibility, integration, and statistical decision-making, leading to novel discoveries about gene-trait relationships and a comprehensive understanding of the genetic and environmental factors influencing these traits.
Bayesian Model Selection
The second thread of my research focuses on advancing Bayesian statistical theory, particularly in high-dimensional model selection. My work establishes model selection consistency for mixtures of g-priors in scenarios where the true model grows. I also proposed the innovative Heavy-tailed Horseshoe prior and established its theoretical properties, such as an asymptotically minimax risk in the L2 norm.
Genetic Epidemiology
The third thread of my research focuses on applying statistical genetics methods to genetic epidemiology studies of Alzheimer’s disease (AD) in multi-ethnic and admixed populations, aiming to uncover genetic and environmental contributors to AD risk and progression across diverse groups. My recent work includes conducting GWAS on the largest Hispanic cohort and establishing the largest multi-ethnic brain transcriptomic dataset for identifying differentially expressed genes in AD.