TY - JOUR
T1 - Effectiveness of shrinkage and variable selection methods for the prediction of complex human traits using data from distantly related individuals
AU - Berger, Swetlana
AU - Pérez-Rodríguez, Paulino
AU - Veturi, Yogasudha
AU - Simianer, Henner
AU - de los Campos, Gustavo
N1 - Publisher Copyright:
© 2015 The Authors. Annals of Human Genetics published by University College London (UCL) and John Wiley & Sons Ltd.
PY - 2015/3/1
Y1 - 2015/3/1
N2 - Genome-wide association studies (GWAS) have detected large numbers of variants associated with complex human traits and diseases. However, the proportion of variance explained by GWAS-significant single nucleotide polymorphisms has usually been small. This has prompted interest in the use of whole-genome regression (WGR) methods. However, there has been limited research on the factors that affect prediction accuracy (PA) of WGRs when applied to human data of distantly related individuals. Here, we examine, using real human genotypes and simulated phenotypes, how trait complexity, marker-quantitative trait loci (QTL) linkage disequilibrium (LD), and the model used affect the performance of WGRs. Our results indicated that the estimated rate of missing heritability is dependent on the extent of marker-QTL LD. However, this parameter was not greatly affected by trait complexity. Regarding PA, our results indicated that: (a) under perfect marker-QTL LD, WGR can achieve moderately high prediction accuracy, and with simple genetic architectures variable selection methods outperform shrinkage procedures; and (b) under imperfect marker-QTL LD, variable selection methods can achieve reasonably good PA with simple or moderately complex genetic architectures; however, the PA of these methods deteriorated as trait complexity increased, and with highly complex traits variable selection and shrinkage methods both performed poorly. This was confirmed with an analysis of human height.
AB - Genome-wide association studies (GWAS) have detected large numbers of variants associated with complex human traits and diseases. However, the proportion of variance explained by GWAS-significant single nucleotide polymorphisms has usually been small. This has prompted interest in the use of whole-genome regression (WGR) methods. However, there has been limited research on the factors that affect prediction accuracy (PA) of WGRs when applied to human data of distantly related individuals. Here, we examine, using real human genotypes and simulated phenotypes, how trait complexity, marker-quantitative trait loci (QTL) linkage disequilibrium (LD), and the model used affect the performance of WGRs. Our results indicated that the estimated rate of missing heritability is dependent on the extent of marker-QTL LD. However, this parameter was not greatly affected by trait complexity. Regarding PA, our results indicated that: (a) under perfect marker-QTL LD, WGR can achieve moderately high prediction accuracy, and with simple genetic architectures variable selection methods outperform shrinkage procedures; and (b) under imperfect marker-QTL LD, variable selection methods can achieve reasonably good PA with simple or moderately complex genetic architectures; however, the PA of these methods deteriorated as trait complexity increased, and with highly complex traits variable selection and shrinkage methods both performed poorly. This was confirmed with an analysis of human height.
UR - http://www.scopus.com/inward/record.url?scp=84923257671&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84923257671&partnerID=8YFLogxK
U2 - 10.1111/ahg.12099
DO - 10.1111/ahg.12099
M3 - Article
C2 - 25600682
AN - SCOPUS:84923257671
SN - 0003-4800
VL - 79
SP - 122
EP - 135
JO - Annals of Human Genetics
JF - Annals of Human Genetics
IS - 2
ER -