TY - JOUR
T1 - Analysis pipeline for the epistasis search - statistical versus biological filtering
AU - Sun, Xiangqing
AU - Lu, Qing
AU - Mukherjee, Shubhabrata
AU - Crane, Paul K.
AU - Elston, Robert
AU - Ritchie, Marylyn D.
PY - 2014
Y1 - 2014
AB - Gene-gene interactions may contribute to the genetic variation underlying complex traits but have not always been taken fully into account. Statistical analyses that consider gene-gene interaction may increase the power of detecting associations, especially for low-marginal-effect markers, and may explain in part the "missing heritability." Detecting pair-wise and higher-order interactions genome-wide requires enormous computational power. Filtering pipelines increase the computational speed by limiting the number of tests performed. We summarize existing filtering approaches to detect epistasis, after distinguishing the purposes that lead us to search for epistasis. Statistical filtering includes quality control on the basis of single marker statistics to avoid the analysis of bad and least informative data, and limits the search space for finding interactions. Biological filtering includes targeting specific pathways, integrating various databases based on known biological and metabolic pathways, gene function ontology and protein-protein interactions. It is increasingly possible to target single-nucleotide polymorphisms that have defined functions on gene expression, though not belonging to protein-coding genes. Filtering can improve the power of an interaction association study, but also increases the chance of missing important findings.
UR - http://www.scopus.com/inward/record.url?scp=84901004089&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84901004089&partnerID=8YFLogxK
U2 - 10.3389/fgene.2014.00106
DO - 10.3389/fgene.2014.00106
M3 - Short survey
C2 - 24817878
AN - SCOPUS:84901004089
SN - 1664-8021
VL - 5
JO - Frontiers in Genetics
JF - Frontiers in Genetics
IS - APR
M1 - Article 106
ER -