TY - JOUR
T1 - Population analysis of large copy number variants and hotspots of human genetic disease
AU - Itsara, Andy
AU - Cooper, Gregory M.
AU - Baker, Carl
AU - Girirajan, Santhosh
AU - Li, Jun
AU - Absher, Devin
AU - Krauss, Ronald M.
AU - Myers, Richard M.
AU - Ridker, Paul M.
AU - Chasman, Daniel I.
AU - Mefford, Heather
AU - Ying, Phyllis
AU - Nickerson, Deborah A.
AU - Eichler, Eva E.
N1 - Funding Information:
We would like to thank A. Singleton for sharing genotype data, generated with support from the Intramural Research Program of the National Institute on Aging, National Institutes of Health, Department of Health and Human Services (Z01-AG000932-01). A.I. is supported by the National Human Genome Research Institute Training Grant (T32 HG00035). G.M.C. is supported by a Merck, Jane Coffin Childs Fellowship. P.M.R. has received research support relevant to the content of this manuscript from the National Heart, Lung, and Blood Institute, the National Cancer Institute, the Donald W. Reynolds Foundation, Roche Diagnostics, and Amgen, Inc. D.I.C. acknowledges support from the Donald W. Reynolds Foundation and the National Institutes of Health. The PARC project is supported by the National Heart, Lung, and Blood Institute (HL01069757). E.E.E. acknowledges the support of the National Institutes of Health (HD043569, HG004120) and is an investigator of the Howard Hughes Medical Institute. The authors have no conflicts of interest to declare.
PY - 2008/8/8
Y1 - 2008/8/8
N2 - Copy number variants (CNVs) contribute to human genetic and phenotypic diversity. However, the distribution of larger CNVs in the general population remains largely unexplored. We identify large variants in ∼2500 individuals by using Illumina SNP data, with an emphasis on "hotspots" prone to recurrent mutations. We find variants larger than 500 kb in 5%-10% of individuals and variants greater than 1 Mb in 1%-2%. In contrast to previous studies, we find limited evidence for stratification of CNVs in geographically distinct human populations. Importantly, our sample size permits a robust distinction between truly rare and polymorphic but low-frequency copy number variation. We find that a significant fraction of individual CNVs larger than 100 kb are rare and that both gene density and size are strongly anticorrelated with allele frequency. Thus, although large CNVs commonly exist in normal individuals, which suggests that size alone cannot be used as a predictor of pathogenicity, such variation is generally deleterious. Considering these observations, we combine our data with published CNVs from more than 12,000 individuals contrasting control and neurological disease collections. This analysis identifies known disease loci and highlights additional CNVs (e.g., 3q29, 16p12, and 15q25.2) for further investigation. This study provides one of the first analyses of large, rare (0.1%-1%) CNVs in the general population, with insights relevant to future analyses of genetic disease.
AB - Copy number variants (CNVs) contribute to human genetic and phenotypic diversity. However, the distribution of larger CNVs in the general population remains largely unexplored. We identify large variants in ∼2500 individuals by using Illumina SNP data, with an emphasis on "hotspots" prone to recurrent mutations. We find variants larger than 500 kb in 5%-10% of individuals and variants greater than 1 Mb in 1%-2%. In contrast to previous studies, we find limited evidence for stratification of CNVs in geographically distinct human populations. Importantly, our sample size permits a robust distinction between truly rare and polymorphic but low-frequency copy number variation. We find that a significant fraction of individual CNVs larger than 100 kb are rare and that both gene density and size are strongly anticorrelated with allele frequency. Thus, although large CNVs commonly exist in normal individuals, which suggests that size alone cannot be used as a predictor of pathogenicity, such variation is generally deleterious. Considering these observations, we combine our data with published CNVs from more than 12,000 individuals contrasting control and neurological disease collections. This analysis identifies known disease loci and highlights additional CNVs (e.g., 3q29, 16p12, and 15q25.2) for further investigation. This study provides one of the first analyses of large, rare (0.1%-1%) CNVs in the general population, with insights relevant to future analyses of genetic disease.
UR - http://www.scopus.com/inward/record.url?scp=62649088108&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=62649088108&partnerID=8YFLogxK
U2 - 10.1016/j.ajhg.2008.12.014
DO - 10.1016/j.ajhg.2008.12.014
M3 - Article
C2 - 19166990
AN - SCOPUS:62649088108
SN - 0002-9297
VL - 84
SP - 148
EP - 161
JO - American Journal of Human Genetics
JF - American Journal of Human Genetics
IS - 2
ER -