Gene-by-environment (G × E) interactions are important for explaining missing heritability and understanding the causation of complex diseases, but a single, moderately sized study often has limited statistical power to detect them. With the increasing need to integrate data and report results from multiple collaborative studies or sites, the debate over the choice between mega- and meta-analysis continues. In principle, data from different sites can be integrated at the individual level into a “mega” data set, which can be fit by a joint “mega-analysis.” Alternatively, analyses can be done at each site, and the site-level results can be combined through a “meta-analysis” procedure without integrating individual-level data across sites. Although mega-analysis has been advocated in several recent initiatives, meta-analysis has the advantages of simplicity and feasibility, and has recently led to several important findings in identifying main genetic effects. In this paper, we conducted empirical and simulation studies, using data from a G × E study of lung cancer, to compare mega- and meta-analysis in four commonly used G × E analyses under the scenario in which the number of studies is small and the sample sizes of individual studies are relatively large. We compared the two data integration approaches separately in the context of fixed-effects models and random-effects models. Our investigations provide valuable insights into the practical differences between mega- and meta-analyses when combining a small number of studies to identify G × E interactions.