TY - JOUR
T1 - Bayesian modeling for genetic association in case-control studies
T2 - Accounting for unknown population substructure
AU - Zhang, Li
AU - Mukherjee, Bhramar
AU - Ghosh, Malay
AU - Wu, Rongling
N1 - Copyright 2007 Elsevier B.V., All rights reserved.
PY - 2006/12
Y1 - 2006/12
N2 - A two-stage parametric Bayesian method is proposed to examine the association between a candidate gene and the occurrence of a disease after accounting for population substructure. This procedure, implemented via a Markov chain Monte Carlo numerical integration technique, first estimates the posterior probability of different unknown population substructures and then integrates this information into a disease-gene association model through the technique of Bayesian model averaging. The model relaxes certain assumptions of previous analyses and provides a unified computational framework to obtain an estimate of the log odds ratio parameter corresponding to the genetic factor after allowing for the allele frequencies to vary across subpopulations. The uncertainty in estimating the population substructure is taken into account while providing credible intervals for parameters in the disease-gene association model. Simulations on unmatched case-control studies that mimic an admixed Argentinean population are performed to demonstrate the statistical properties of our model. The method is also applied to a real data set coming from a genetic association study on obesity.
AB - A two-stage parametric Bayesian method is proposed to examine the association between a candidate gene and the occurrence of a disease after accounting for population substructure. This procedure, implemented via a Markov chain Monte Carlo numerical integration technique, first estimates the posterior probability of different unknown population substructures and then integrates this information into a disease-gene association model through the technique of Bayesian model averaging. The model relaxes certain assumptions of previous analyses and provides a unified computational framework to obtain an estimate of the log odds ratio parameter corresponding to the genetic factor after allowing for the allele frequencies to vary across subpopulations. The uncertainty in estimating the population substructure is taken into account while providing credible intervals for parameters in the disease-gene association model. Simulations on unmatched case-control studies that mimic an admixed Argentinean population are performed to demonstrate the statistical properties of our model. The method is also applied to a real data set coming from a genetic association study on obesity.
UR - http://www.scopus.com/inward/record.url?scp=33847414433&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33847414433&partnerID=8YFLogxK
U2 - 10.1177/1471082006071841
DO - 10.1177/1471082006071841
M3 - Article
AN - SCOPUS:33847414433
SN - 1471-082X
VL - 6
SP - 352
EP - 372
JO - Statistical Modelling
JF - Statistical Modelling
IS - 4
ER -