Large-scale studies of genetic variation may help clarify the genetic mechanisms that control viral infection and, ultimately, help predict and eliminate infectious disease outbreaks. We propose a new statistical model for detecting specific DNA sequence variants that are responsible for viral infection. The model considers additive, dominance, and epistatic effects of haplotypes from three different genomes (recipient, transmitter, and virus) through an epidemiological process. It is constructed within the maximum likelihood framework and implemented with the EM algorithm. We formulate a number of hypothesis tests about population genetic structure and diversity and about the pattern of genetic control. We derive a series of closed-form EM solutions for estimating haplotype frequencies and haplotype effects in a network of genetic interactions among the three genomes. Simulation studies were performed to test the statistical properties of the model and to recommend the sample sizes needed for reasonably accurate and precise parameter estimation. By integrating, for the first time, the epidemiological principle of viral infection into genetic mapping, the new model should find immediate application in studying the genetic architecture of viral infection.
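To make the EM step for haplotype frequencies concrete, the following is a minimal sketch of the classical EM estimator for two biallelic SNPs in a single random-mating population. It is not the three-genome model described above: it omits haplotype effects, the epidemiological process, and the recipient/transmitter/virus structure. The names `HAPS`, `compatible_pairs`, and `em_haplotype_freqs`, the genotype coding (count of allele 1 at each SNP), and the Hardy-Weinberg assumption are all illustrative choices, not taken from the paper.

```python
# Haplotypes over two biallelic SNPs, coded by the allele (0 or 1) at each SNP.
HAPS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def compatible_pairs(genotype):
    """Return unordered haplotype index pairs (i <= j) consistent with a genotype.

    A genotype is (g1, g2), the count of allele 1 at each SNP (0, 1, or 2).
    Only the double heterozygote (1, 1) has more than one compatible pair.
    """
    pairs = []
    for i in range(4):
        for j in range(i, 4):
            if all(HAPS[i][k] + HAPS[j][k] == genotype[k] for k in range(2)):
                pairs.append((i, j))
    return pairs


def em_haplotype_freqs(genotype_counts, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies from unphased genotype counts by EM.

    Assumes random mating (Hardy-Weinberg), which is an assumption of this
    sketch and not a statement about the model in the abstract.
    """
    freqs = [0.25] * 4
    total_haps = 2 * sum(genotype_counts.values())
    for _ in range(n_iter):
        expected = [0.0] * 4
        for genotype, n in genotype_counts.items():
            pairs = compatible_pairs(genotype)
            # E-step: posterior weight of each phase given current frequencies.
            weights = [(1 if i == j else 2) * freqs[i] * freqs[j] for i, j in pairs]
            total = sum(weights)
            if total == 0:
                # Fall back to an equal split if all phases have zero weight.
                weights = [1.0] * len(pairs)
                total = float(len(pairs))
            for (i, j), w in zip(pairs, weights):
                expected[i] += n * w / total
                expected[j] += n * w / total
        # M-step: closed-form update, expected haplotype counts over total.
        new_freqs = [e / total_haps for e in expected]
        converged = max(abs(a - b) for a, b in zip(new_freqs, freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs


if __name__ == "__main__":
    # Hypothetical counts: 30 (0,0), 30 (2,2), and 40 double heterozygotes.
    counts = {(0, 0): 30, (2, 2): 30, (1, 1): 40}
    print(em_haplotype_freqs(counts))
```

The M-step here is a closed-form ratio of expected counts, which is the kind of update the abstract refers to. The full model would extend the E-step to weight haplotype configurations across all three genomes and would add updates for the haplotype effects.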
Statistical Applications in Genetics and Molecular Biology
Published - 2009
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Molecular Biology
- Computational Mathematics