Project Details


Rare variants have been shown to influence complex human diseases with significant relevance to public health, including type II diabetes-related glycemic traits, coronary artery disease, and age-related macular degeneration. Most of the rare variants involved in complex human diseases have moderate effects, which makes it necessary to analyze large samples. Technological advances such as the use of social media and consumer-directed genetics have greatly empowered researchers to quickly recruit study participants with interesting phenotypes. Thanks to the decreasing cost of sequencing and microarray genotyping, there is an unprecedented opportunity to assess the impact of rare variants in this ever-growing reservoir of sequenced and genotyped samples, and to understand the genetic architecture of rare variants. Meta-analysis has been a powerful tool for aggregating genotype-phenotype association information from multiple cohorts. Compared to methods that require pooling individual-level data, meta-analyses better protect study participant privacy, are more robust against heterogeneity between studies, and offer equal power for detecting associations. In many settings where sharing individual-level information is impossible, meta-analysis is the only potential solution. In the sequencing age, meta-analysis methods warrant additional development in order to accommodate the greatly increased scale of the datasets, enable more accurate assessment of statistical significance when analyzing low-frequency variants, and allow for more robust association analyses.

In Aim 1, we will develop novel methods to enable more accurate meta-analyses of low-frequency variants, extending ideas from the Firth and Bartlett corrections. In Aim 2, we will develop more scalable methods to meta-analyze low-frequency variants, borrowing strength from large reference panels, e.g., from the Haplotype Reference Consortium. In Aim 3, we will develop methods to accommodate heterogeneity in sequence data and enable more robust meta-analyses. In Aim 4, we will develop methods that enable the global assessment of genetic architectures, allowing more accurate enrichment analyses and tissue-specific analyses. For all methods arising from this proposal, we will provide software implementing them, continuing our strong track record in this direction (Aim 5). To achieve our research goals, we have assembled a strong research team consisting not only of method developers but also of geneticists leading large, high-profile studies. Methods and tools from this proposal will be applied to some of the largest datasets in the world for studying nicotine and alcohol dependence, lipid levels, heart disease, and macular degeneration.
Effective start/end date: 5/16/16 to 4/30/21


  • National Human Genome Research Institute: $386,289.00
  • National Human Genome Research Institute: $386,300.00
  • National Human Genome Research Institute: $399,909.00
  • National Human Genome Research Institute: $386,278.00

