TY - JOUR
T1 - Machine learning for detecting gene-gene interactions
T2 - A review
AU - McKinney, Brett A.
AU - Reif, David M.
AU - Ritchie, Marylyn D.
AU - Moore, Jason H.
N1 - Funding Information:
This work was supported by National Institutes of Health (NIH) grants AI059694, LM009012, AI057661, AI064625, HL65234, RR018787, ES007373 and HD047447. This work was also supported by generous funds from the Vanderbilt Program in Biomathematics and the Norris-Cotton Cancer Center at Dartmouth Medical School.
PY - 2006
Y1 - 2006
AB - Complex interactions among genes and environmental factors are known to play a role in common human disease aetiology. There is a growing body of evidence to suggest that complex interactions are 'the norm' and, rather than amounting to a small perturbation to classical Mendelian genetics, interactions may be the predominant effect. Traditional statistical methods are not well suited for detecting such interactions, especially when the data are high dimensional (many attributes or independent variables) or when interactions occur between more than two polymorphisms. In this review, we discuss machine-learning models and algorithms for identifying and characterising susceptibility genes in common, complex, multifactorial human diseases. We focus on the following machine-learning methods that have been used to detect gene-gene interactions: neural networks, cellular automata, random forests, and multifactor dimensionality reduction. We conclude with some ideas about how these methods and others can be integrated into a comprehensive and flexible framework for data mining and knowledge discovery in human genetics.
UR - http://www.scopus.com/inward/record.url?scp=33744937046&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33744937046&partnerID=8YFLogxK
U2 - 10.2165/00822942-200605020-00002
DO - 10.2165/00822942-200605020-00002
M3 - Review article
C2 - 16722772
AN - SCOPUS:33744937046
SN - 1175-5636
VL - 5
SP - 77
EP - 88
JO - Applied Bioinformatics
JF - Applied Bioinformatics
IS - 2
ER -