The explosion of genetic information over the last decade presents an analytical challenge for genetic association studies. As the number of genetic variables examined per individual increases, both variable selection and statistical modeling must be performed during analysis. While these tasks could be performed separately, coupling them is necessary to select meaningful variables that effectively model the data. This challenge is heightened by the complex nature of the phenotypes under study and their complex underlying genetic etiologies. To address this problem, a number of novel methods have been developed. In the current study, we compare the performance of six analytical approaches in detecting both main effects and gene-gene interactions across a range of genetic models: multifactor dimensionality reduction, grammatical evolution neural networks, random forests, the focused interaction testing framework, stepwise logistic regression, and explicit logistic regression. As one might expect, the relative success of each method is context dependent. This study demonstrates the strengths and weaknesses of each method and illustrates the importance of continued methods development.