TY - GEN
T1 - Generating linkage disequilibrium patterns in data simulations using genomeSIMLA
AU - Edwards, Todd L.
AU - Bush, William S.
AU - Turner, Stephen D.
AU - Dudek, Scott M.
AU - Torstenson, Eric S.
AU - Schmidt, Mike
AU - Martin, Eden
AU - Ritchie, Marylyn D.
PY - 2008
Y1 - 2008
AB - Whole-genome association (WGA) studies are becoming a common tool for the exploration of the genetic components of common disease. The analysis of such large-scale data presents unique analytical challenges, including problems of multiple testing, correlated independent variables, and large multivariate model spaces. These issues have prompted the development of novel computational approaches. Thorough, extensive simulation studies are a necessity for methods development work to evaluate the power and validity of novel approaches. Many data simulation packages exist; however, the resulting data is often overly simplistic and does not compare to the complexity of real data, especially with respect to linkage disequilibrium (LD). To overcome this limitation, we have developed genomeSIMLA. GenomeSIMLA is a forward-time population simulation method that can simulate realistic patterns of LD in both family-based and case-control datasets. In this manuscript, we demonstrate how LD patterns of the simulated data change under different population growth curve parameter initialization settings. These results provide guidelines to simulate WGA datasets whose properties resemble the HapMap.
UR - http://www.scopus.com/inward/record.url?scp=47249130409&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=47249130409&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-78757-0_3
DO - 10.1007/978-3-540-78757-0_3
M3 - Conference contribution
AN - SCOPUS:47249130409
SN - 3540787569
SN - 9783540787563
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 24
EP - 35
BT - Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics - 6th European Conference, EvoBIO 2008, Proceedings
T2 - 6th European Conference on Evolutionary Computation, Machine Learning, and Data Mining in Bioinformatics, EvoBIO 2008
Y2 - 26 March 2008 through 28 March 2008
ER -