TY - JOUR
T1 - Incorporating inter-relationships between different levels of genomic data into cancer clinical outcome prediction
AU - Kim, Dokyoon
AU - Shin, Hyunjung
AU - Sohn, Kyung Ah
AU - Verma, Anurag
AU - Ritchie, Marylyn D.
AU - Kim, Ju Han
N1 - Funding Information:
This research was supported by a grant of the Korean Health Technology R&D Project, Ministry of Health and Welfare (H13C2164) and by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. 2010-0028631). MDR would like to gratefully acknowledge support by NIH grant 5R01 LM010040 and NHLBI grant 2U01 HL065962. In addition, we gratefully acknowledge the TCGA Consortium and all its members for the TCGA Project initiative, for providing samples, tissues, data processing and making data and results available.
PY - 2014/6/1
Y1 - 2014/6/1
N2 - In order to improve our understanding of cancer and develop multi-layered theoretical models for the underlying mechanism, it is essential to have enhanced understanding of the interactions between multiple levels of genomic data that contribute to tumor formation and progression. Although there exist recent approaches such as a graph-based framework that integrates multi-omics data including copy number alteration, methylation, gene expression, and miRNA data for cancer clinical outcome prediction, most of previous methods treat each genomic data as independent and the possible interplay between them is not explicitly incorporated to the model. However, cancer is dysregulated by multiple levels in the biological system through genomic, epigenomic, transcriptomic, and proteomic level. Thus, genomic features are likely to interact with other genomic features in the different genomic levels. In order to deepen our knowledge, it would be desirable to incorporate such inter-relationship information when integrating multi-omics data for cancer clinical outcome prediction. In this study, we propose a new graph-based framework that integrates not only multi-omics data but inter-relationship between them for better elucidating cancer clinical outcomes. In order to highlight the validity of the proposed framework, serous cystadenocarcinoma data from TCGA was adopted as a pilot task. The proposed model incorporating inter-relationship between different genomic features showed significantly improved performance compared to the model that does not consider inter-relationship when integrating multi-omics data. For the pair between miRNA and gene expression data, the model integrating miRNA, for example, gene expression, and inter-relationship between them with an AUC of 0.8476 (REI) outperformed the model combining miRNA and gene expression data with an AUC of 0.8404. Similar results were also obtained for other pairs between different levels of genomic data. Integration of different levels of data and inter-relationship between them can aid in extracting new biological knowledge by drawing an integrative conclusion from many pieces of information collected from diverse types of genomic data, eventually leading to more effective screening strategies and alternative therapies that may improve outcomes.
AB - In order to improve our understanding of cancer and develop multi-layered theoretical models for the underlying mechanism, it is essential to have enhanced understanding of the interactions between multiple levels of genomic data that contribute to tumor formation and progression. Although there exist recent approaches such as a graph-based framework that integrates multi-omics data including copy number alteration, methylation, gene expression, and miRNA data for cancer clinical outcome prediction, most of previous methods treat each genomic data as independent and the possible interplay between them is not explicitly incorporated to the model. However, cancer is dysregulated by multiple levels in the biological system through genomic, epigenomic, transcriptomic, and proteomic level. Thus, genomic features are likely to interact with other genomic features in the different genomic levels. In order to deepen our knowledge, it would be desirable to incorporate such inter-relationship information when integrating multi-omics data for cancer clinical outcome prediction. In this study, we propose a new graph-based framework that integrates not only multi-omics data but inter-relationship between them for better elucidating cancer clinical outcomes. In order to highlight the validity of the proposed framework, serous cystadenocarcinoma data from TCGA was adopted as a pilot task. The proposed model incorporating inter-relationship between different genomic features showed significantly improved performance compared to the model that does not consider inter-relationship when integrating multi-omics data. For the pair between miRNA and gene expression data, the model integrating miRNA, for example, gene expression, and inter-relationship between them with an AUC of 0.8476 (REI) outperformed the model combining miRNA and gene expression data with an AUC of 0.8404. Similar results were also obtained for other pairs between different levels of genomic data. Integration of different levels of data and inter-relationship between them can aid in extracting new biological knowledge by drawing an integrative conclusion from many pieces of information collected from diverse types of genomic data, eventually leading to more effective screening strategies and alternative therapies that may improve outcomes.
UR - http://www.scopus.com/inward/record.url?scp=84901411948&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84901411948&partnerID=8YFLogxK
U2 - 10.1016/j.ymeth.2014.02.003
DO - 10.1016/j.ymeth.2014.02.003
M3 - Article
C2 - 24561168
AN - SCOPUS:84901411948
SN - 1046-2023
VL - 67
SP - 344
EP - 353
JO - Methods
JF - Methods
IS - 3
ER -