Genetic mutations at specific nucleotides are recognized to give rise to cancer by altering signaling pathways, so the detection of cancer-causing mutations has received considerable interest in cancer genetics research. Here, we propose a statistical model for characterizing genes that lead to cancer through point mutations using genome-wide single nucleotide polymorphism (SNP) data. The basic idea of the model is that mutated genes may be strongly associated with nearby SNPs because of evolutionary forces. By genotyping SNPs in both normal and cancer cells, we formulate a polynomial likelihood to estimate the population genetic parameters related to cancer, such as the frequencies of cancer-causing alleles, the mutation rates of maternally and paternally derived alleles, and the zygotic linkage disequilibria between different loci after the mutation occurs. Because the likelihood involves missing information, we implement the EM algorithm to estimate some of these parameters. The model allows tests of the significance of associations between mutated cancer genes and genome-wide SNPs, thus providing a way to predict the occurrence and formation of cancer from genetic information. The model, validated through computer simulation, may help cancer geneticists design efficient experiments and formulate hypotheses for cancer gene identification.
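To illustrate how an EM algorithm handles missing information in genotype data, the following is a minimal sketch of a textbook case, not the model proposed here: estimating two-locus haplotype frequencies and the gametic linkage disequilibrium D between two biallelic SNPs when the phase of double heterozygotes is unobserved. The function names, the 3 x 3 genotype-count layout, and the use of gametic rather than zygotic disequilibrium are all assumptions made for illustration.

```python
def em_haplotype_freqs(counts, n_iter=500, tol=1e-12):
    """EM estimate of haplotype frequencies (AB, Ab, aB, ab).

    counts[i][j] is the number of individuals carrying i copies of
    allele A at SNP 1 and j copies of allele B at SNP 2 (hypothetical
    layout chosen for this sketch).
    """
    n = counts
    total = sum(sum(row) for row in n)
    if total == 0:
        raise ValueError("no genotypes supplied")

    # Haplotype counts from genotypes whose phase is unambiguous.
    base_AB = 2 * n[2][2] + n[2][1] + n[1][2]
    base_Ab = 2 * n[2][0] + n[2][1] + n[1][0]
    base_aB = 2 * n[0][2] + n[1][2] + n[0][1]
    base_ab = 2 * n[0][0] + n[1][0] + n[0][1]
    dh = n[1][1]  # double heterozygotes: AB/ab or Ab/aB (phase missing)

    p = [0.25, 0.25, 0.25, 0.25]  # initial guess: AB, Ab, aB, ab
    for _ in range(n_iter):
        # E-step: posterior probability that a double heterozygote is AB/ab.
        cis = p[0] * p[3]
        trans = p[1] * p[2]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5

        # M-step: update frequencies from expected haplotype counts.
        new = [
            (base_AB + w * dh) / (2 * total),
            (base_Ab + (1 - w) * dh) / (2 * total),
            (base_aB + (1 - w) * dh) / (2 * total),
            (base_ab + w * dh) / (2 * total),
        ]
        converged = max(abs(a - b) for a, b in zip(new, p)) < tol
        p = new
        if converged:
            break
    return tuple(p)


def linkage_disequilibrium(p):
    """Gametic disequilibrium D = p_AB - p_A * p_B."""
    p_AB, p_Ab, p_aB, _ = p
    return p_AB - (p_AB + p_Ab) * (p_AB + p_aB)
```

Calling `em_haplotype_freqs` on a table of genotype counts and passing the result to `linkage_disequilibrium` gives a point estimate of D. The model described in the abstract would additionally involve mutation rates and the comparison of normal and cancer cells, which this sketch does not attempt.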
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Modeling and Simulation
- Biochemistry, Genetics and Molecular Biology(all)
- Immunology and Microbiology(all)
- Agricultural and Biological Sciences(all)
- Applied Mathematics