TY - JOUR
T1 - Genetic Diversity and Association Studies in US Hispanic/Latino Populations
T2 - Applications in the Hispanic Community Health Study/Study of Latinos
AU - Conomos, Matthew P.
AU - Laurie, Cecelia A.
AU - Stilp, Adrienne M.
AU - Gogarten, Stephanie M.
AU - McHugh, Caitlin P.
AU - Nelson, Sarah C.
AU - Sofer, Tamar
AU - Fernández-Rhodes, Lindsay
AU - Justice, Anne E.
AU - Graff, Mariaelisa
AU - Young, Kristin L.
AU - Seyerle, Amanda A.
AU - Avery, Christy L.
AU - Taylor, Kent D.
AU - Rotter, Jerome I.
AU - Talavera, Gregory A.
AU - Daviglus, Martha L.
AU - Wassertheil-Smoller, Sylvia
AU - Schneiderman, Neil
AU - Heiss, Gerardo
AU - Kaplan, Robert C.
AU - Franceschini, Nora
AU - Reiner, Alex P.
AU - Shaffer, John R.
AU - Barr, R. Graham
AU - Kerr, Kathleen F.
AU - Browning, Sharon R.
AU - Browning, Brian L.
AU - Weir, Bruce S.
AU - Avilés-Santa, M. Larissa
AU - Papanicolaou, George J.
AU - Lumley, Thomas
AU - Szpiro, Adam A.
AU - North, Kari E.
AU - Rice, Ken
AU - Thornton, Timothy A.
AU - Laurie, Cathy C.
N1 - Publisher Copyright: © 2016 The American Society of Human Genetics.
PY - 2016/1/7
Y1 - 2016/1/7
AB - US Hispanic/Latino individuals are diverse in genetic ancestry, culture, and environmental exposures. Here, we characterized and controlled for this diversity in genome-wide association studies (GWASs) for the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). We simultaneously estimated population-structure principal components (PCs) robust to familial relatedness and pairwise kinship coefficients (KCs) robust to population structure, admixture, and Hardy-Weinberg departures. The PCs revealed substantial genetic differentiation within and among six self-identified background groups (Cuban, Dominican, Puerto Rican, Mexican, Central American, and South American). To control for variation among groups, we developed a multi-dimensional clustering method to define a "genetic-analysis group" variable that retains many properties of self-identified background while achieving substantially greater genetic homogeneity within groups and including participants with non-specific self-identification. In GWASs of 22 biomedical traits, we used a linear mixed model (LMM) including pairwise empirical KCs to account for familial relatedness, PCs for ancestry, and genetic-analysis groups for additional group-associated effects. Including the genetic-analysis group as a covariate accounted for significant trait variation in 8 of 22 traits, even after we fit 20 PCs. Additionally, genetic-analysis groups had significant heterogeneity of residual variance for 20 of 22 traits, and modeling this heteroscedasticity within the LMM reduced genomic inflation for 19 traits. Furthermore, fitting an LMM that utilized a genetic-analysis group rather than a self-identified background group achieved higher power to detect previously reported associations. We expect that the methods applied here will be useful in other studies with multiple ethnic groups, admixture, and relatedness.
UR - http://www.scopus.com/inward/record.url?scp=84954289580&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84954289580&partnerID=8YFLogxK
U2 - 10.1016/j.ajhg.2015.12.001
DO - 10.1016/j.ajhg.2015.12.001
M3 - Article
C2 - 26748518
AN - SCOPUS:84954289580
SN - 0002-9297
VL - 98
SP - 165
EP - 184
JO - American Journal of Human Genetics
JF - American Journal of Human Genetics
IS - 1
ER -