TY - GEN
T1 - COE
T2 - 13th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2009
AU - Zhang, Xiang
AU - Pan, Feng
AU - Xie, Yuying
AU - Zou, Fei
AU - Wang, Wei
PY - 2009
Y1 - 2009
N2 - The availability of high-density single nucleotide polymorphism (SNP) data has made genome-wide association studies computationally challenging. Two-locus epistasis (gene-gene interaction) detection has attracted great research interest as a promising method for genetic analysis of complex diseases. In this paper, we propose a general approach, COE, for efficient large-scale gene-gene interaction analysis, which supports a wide range of tests. In particular, we show that many commonly used statistics are convex functions. From the observed values of the events in a two-locus association test, we can develop an upper bound on the test value. Such an upper bound depends only on the single-locus test and the genotype of the SNP-pair. We thus group and index SNP-pairs by their genotypes. This indexing structure can benefit the computation of all convex statistics. Utilizing the upper bound and the indexing structure, we can prune most of the SNP-pairs without compromising the optimality of the result. Our approach is especially efficient for large permutation tests. Extensive experiments demonstrate that our approach provides orders-of-magnitude performance improvement over the brute-force approach.
AB - The availability of high-density single nucleotide polymorphism (SNP) data has made genome-wide association studies computationally challenging. Two-locus epistasis (gene-gene interaction) detection has attracted great research interest as a promising method for genetic analysis of complex diseases. In this paper, we propose a general approach, COE, for efficient large-scale gene-gene interaction analysis, which supports a wide range of tests. In particular, we show that many commonly used statistics are convex functions. From the observed values of the events in a two-locus association test, we can develop an upper bound on the test value. Such an upper bound depends only on the single-locus test and the genotype of the SNP-pair. We thus group and index SNP-pairs by their genotypes. This indexing structure can benefit the computation of all convex statistics. Utilizing the upper bound and the indexing structure, we can prune most of the SNP-pairs without compromising the optimality of the result. Our approach is especially efficient for large permutation tests. Extensive experiments demonstrate that our approach provides orders-of-magnitude performance improvement over the brute-force approach.
UR - http://www.scopus.com/inward/record.url?scp=67650308781&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=67650308781&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-02008-7_19
DO - 10.1007/978-3-642-02008-7_19
M3 - Conference contribution
AN - SCOPUS:67650308781
SN - 9783642020070
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 253
EP - 269
BT - Research in Computational Molecular Biology - 13th Annual International Conference, RECOMB 2009, Proceedings
Y2 - 18 May 2009 through 21 May 2009
ER -