TY - JOUR
T1 - MB-SupCon
T2 - Microbiome-based Predictive Models via Supervised Contrastive Learning
AU - Yang, Sen
AU - Wang, Shidan
AU - Wang, Yiqing
AU - Rong, Ruichen
AU - Kim, Jiwoong
AU - Li, Bo
AU - Koh, Andrew Y.
AU - Xiao, Guanghua
AU - Li, Qiwei
AU - Liu, Dajiang J.
AU - Zhan, Xiaowei
N1 - Publisher Copyright: © 2022
PY - 2022/8/15
Y1 - 2022/8/15
N2 - The human microbiome consists of trillions of microorganisms. Microbiota can modulate the host physiology through molecule and metabolite interactions. Integrating microbiome and metabolomics data has the potential to predict different diseases more accurately. Yet most datasets measure only microbiome data, without paired metabolome data. Here, we propose a novel integrative modeling framework, Microbiome-based Supervised Contrastive Learning Framework (MB-SupCon). MB-SupCon integrates microbiome and metabolome data to generate microbiome embeddings, which can be used to improve the prediction accuracy in datasets that only measure microbiome data. As a proof of concept, we applied MB-SupCon to 720 samples with paired 16S microbiome data and metabolomics data from patients with type 2 diabetes. MB-SupCon outperformed existing prediction methods and achieved high average prediction accuracies for insulin resistance status (84.62%), sex (78.98%), and race (80.04%). Moreover, the microbiome embeddings form separable clusters for different covariate groups in the lower-dimensional space, which enhances data visualization. We also applied MB-SupCon to a large inflammatory bowel disease study and observed similar advantages. Thus, MB-SupCon could be broadly applicable to improve microbiome prediction models in multi-omics disease studies.
AB - The human microbiome consists of trillions of microorganisms. Microbiota can modulate the host physiology through molecule and metabolite interactions. Integrating microbiome and metabolomics data has the potential to predict different diseases more accurately. Yet most datasets measure only microbiome data, without paired metabolome data. Here, we propose a novel integrative modeling framework, Microbiome-based Supervised Contrastive Learning Framework (MB-SupCon). MB-SupCon integrates microbiome and metabolome data to generate microbiome embeddings, which can be used to improve the prediction accuracy in datasets that only measure microbiome data. As a proof of concept, we applied MB-SupCon to 720 samples with paired 16S microbiome data and metabolomics data from patients with type 2 diabetes. MB-SupCon outperformed existing prediction methods and achieved high average prediction accuracies for insulin resistance status (84.62%), sex (78.98%), and race (80.04%). Moreover, the microbiome embeddings form separable clusters for different covariate groups in the lower-dimensional space, which enhances data visualization. We also applied MB-SupCon to a large inflammatory bowel disease study and observed similar advantages. Thus, MB-SupCon could be broadly applicable to improve microbiome prediction models in multi-omics disease studies.
UR - http://www.scopus.com/inward/record.url?scp=85133714609&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85133714609&partnerID=8YFLogxK
U2 - 10.1016/j.jmb.2022.167693
DO - 10.1016/j.jmb.2022.167693
M3 - Article
C2 - 35777465
AN - SCOPUS:85133714609
SN - 0022-2836
VL - 434
JO - Journal of Molecular Biology
JF - Journal of Molecular Biology
IS - 15
M1 - 167693
ER -