TY - JOUR
T1 - Regression hidden Markov modeling reveals heterogeneous gene expression regulation
T2 - A case study in mouse embryonic stem cells
AU - Lee, Yeonok
AU - Ghosh, Debashis
AU - Zhang, Yu
N1 - Funding Information: The project was supported in part by grants NIH UL1TR000127, R01 CA129102, R01 HG004718, and NSF ABI-1262538. Publication of this manuscript is supported by grant R01 HG004718. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. The authors thank Chao Cheng who provided gene expression data.
PY - 2014/5/12
Y1 - 2014/5/12
N2 - Background: Studies have shown a strong association between histone modification levels and gene expression levels. The detailed relationship between the two can vary substantially due to differential regulation, and hence a simple regression model may not be adequate. We apply a regression hidden Markov model (regHMM) to further investigate the potentially multiple relationships between gene expression and histone methylation levels in mouse embryonic stem cells. Results: The levels of seven histone methylation marks are used in the study. Histone modification levels averaged over non-overlapping 200 bp windows spanning the transcription start site (TSS) ± 1 Kb are used as predictors, giving 70 explanatory variables in total. Based on the regHMM results, genes segregate into two groups, referred to as State 1 and State 2, with distinct association strengths. Genes in State 1 are better explained by histone methylation levels, with R² = 0.72, while those in State 2 show a weaker association, with R² = 0.38. The regression coefficients in the two states do not differ much in magnitude except for the intercept, which is 0.25 for State 1 and 1.15 for State 2. We found specific GO categories that may be attributed to the different relationships. The GO categories more frequently observed in State 2 match those of housekeeping genes, such as cytoplasm, nucleus, and protein binding. In addition, housekeeping gene expression levels are significantly less well explained by histone methylation in mouse embryonic stem cells, which is consistent with the constitutive expression patterns that would be expected. Conclusion: Gene expression levels are not universally affected by histone methylation levels, and the relationship between the two differs by gene function. The expression levels of genes annotated with the GO categories most common among housekeeping genes are less strongly associated with histone methylation levels. We suspect that additional biological factors may also be strongly associated with the gene expression levels in State 2. We find that the effect of the presence of a CpG island within TSS ± 1 Kb is larger in State 2.
AB - Background: Studies have shown a strong association between histone modification levels and gene expression levels. The detailed relationship between the two can vary substantially due to differential regulation, and hence a simple regression model may not be adequate. We apply a regression hidden Markov model (regHMM) to further investigate the potentially multiple relationships between gene expression and histone methylation levels in mouse embryonic stem cells. Results: The levels of seven histone methylation marks are used in the study. Histone modification levels averaged over non-overlapping 200 bp windows spanning the transcription start site (TSS) ± 1 Kb are used as predictors, giving 70 explanatory variables in total. Based on the regHMM results, genes segregate into two groups, referred to as State 1 and State 2, with distinct association strengths. Genes in State 1 are better explained by histone methylation levels, with R² = 0.72, while those in State 2 show a weaker association, with R² = 0.38. The regression coefficients in the two states do not differ much in magnitude except for the intercept, which is 0.25 for State 1 and 1.15 for State 2. We found specific GO categories that may be attributed to the different relationships. The GO categories more frequently observed in State 2 match those of housekeeping genes, such as cytoplasm, nucleus, and protein binding. In addition, housekeeping gene expression levels are significantly less well explained by histone methylation in mouse embryonic stem cells, which is consistent with the constitutive expression patterns that would be expected. Conclusion: Gene expression levels are not universally affected by histone methylation levels, and the relationship between the two differs by gene function. The expression levels of genes annotated with the GO categories most common among housekeeping genes are less strongly associated with histone methylation levels. We suspect that additional biological factors may also be strongly associated with the gene expression levels in State 2. We find that the effect of the presence of a CpG island within TSS ± 1 Kb is larger in State 2.
UR - http://www.scopus.com/inward/record.url?scp=84903513491&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84903513491&partnerID=8YFLogxK
U2 - 10.1186/1471-2164-15-360
DO - 10.1186/1471-2164-15-360
M3 - Article
C2 - 24884369
AN - SCOPUS:84903513491
SN - 1471-2164
VL - 15
JO - BMC genomics
JF - BMC genomics
IS - 1
M1 - 360
ER -