Bayesian modeling for genetic association in case-control studies: Accounting for unknown population substructure

Li Zhang, Bhramar Mukherjee, Malay Ghosh, Rongling Wu

Research output: Contribution to journal (article, peer-reviewed)

7 Scopus citations


A two-stage parametric Bayesian method is proposed to examine the association between a candidate gene and the occurrence of a disease after accounting for population substructure. The procedure, implemented via Markov chain Monte Carlo numerical integration, first estimates the posterior probabilities of different unknown population substructures and then integrates this information into a disease-gene association model through Bayesian model averaging. The model relaxes certain assumptions of previous analyses and provides a unified computational framework for estimating the log odds ratio parameter corresponding to the genetic factor while allowing allele frequencies to vary across subpopulations. The uncertainty in estimating the population substructure is taken into account when providing credible intervals for parameters in the disease-gene association model. Simulations of unmatched case-control studies that mimic an admixed Argentinean population are performed to demonstrate the statistical properties of the model. The method is also applied to a real data set from a genetic association study on obesity.
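The model-averaging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that stage one has already produced a posterior probability for each candidate substructure, and that stage two has produced a posterior mean and variance of the log odds ratio under each one. The function name `bma_combine` and all numbers in the usage lines are hypothetical, chosen only for illustration.

```python
import math


def bma_combine(weights, means, variances):
    """Combine per-model posterior summaries by Bayesian model averaging.

    weights   : posterior probabilities of the candidate substructures
                (assumed given; renormalized here to sum to 1)
    means     : posterior mean of the log odds ratio under each substructure
    variances : posterior variance of the log odds ratio under each substructure

    Returns the model-averaged mean and variance of the log odds ratio.
    """
    if not (len(weights) == len(means) == len(variances)):
        raise ValueError("weights, means and variances must have equal length")
    total = math.fsum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    w = [x / total for x in weights]

    # Averaged mean: sum_k w_k * m_k
    mean = math.fsum(wk * mk for wk, mk in zip(w, means))

    # Averaged variance via the law of total variance:
    # sum_k w_k * (v_k + m_k^2) - mean^2
    second_moment = math.fsum(
        wk * (vk + mk ** 2) for wk, mk, vk in zip(w, means, variances)
    )
    return mean, second_moment - mean ** 2


# Hypothetical inputs: three candidate substructures (illustrative values only).
avg_mean, avg_var = bma_combine(
    weights=[0.6, 0.3, 0.1],
    means=[0.40, 0.55, 0.20],
    variances=[0.04, 0.05, 0.06],
)
print(avg_mean, avg_var)
```

The averaged variance includes a between-model term, so disagreement among substructures about the log odds ratio widens the resulting interval. This is one simple way in which uncertainty about the substructure can carry through to the association parameters.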

Original language: English (US)
Pages (from-to): 352-372
Number of pages: 21
Journal: Statistical Modelling
Issue number: 4
State: Published - Dec 2006

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

