TY - JOUR
T1 - A Machine Learning-Aided Framework to Predict Outcomes of Anti-PD-1 Therapy for Patients with Gynecological Cancer on Incomplete Post-Marketing Surveillance Dataset
AU - Liu, Xiaomei
AU - Xiao, Zhifeng
AU - Song, Yang
AU - Zhang, Ruizhe
AU - Li, Xiuqin
AU - Du, Zhenhua
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2021
Y1 - 2021
AB - Post-marketing surveillance of antineoplastic agents evaluates efficacy and safety in patients, with the aim of expanding drug indications and discovering potential adverse events. Such real-world data is fraught with missing values, yet literature on strategies for handling missing data in this setting is scarce. Using machine learning (ML) algorithms to predict therapeutic outcomes of PD-1/PD-L1 inhibitors has attracted attention. However, training a predictive model usually requires imaging or biomarker information, which is rarely available in post-marketing surveillance data. To address these challenges, we propose an ML-aided framework to predict the outcomes of anti-PD-1 therapy for gynecological malignancy on a dataset of 117 patient samples, treated with Camrelizumab (50 patient samples), Sintilimab (44), and Toripalimab (23). Four therapeutic outcomes are predicted: response according to the Response Evaluation Criteria in Solid Tumours (RECIST), organ adverse effect (AE), general AE, and death. The proposed framework feeds the dataset into a learning pipeline consisting of imputation, feature engineering, model training, ensemble learning, and model selection to generate the final predictive model. We conduct experiments to justify several critical design choices, such as the specific feature engineering strategies and the SMOTE over-sampling technique. The final model for each learning task is selected from a large pool of candidate models based on joint consideration of accuracy and F1 score. Moreover, we conduct a thorough, visualized model analysis to gain a deeper understanding of model behavior and feature importance. The results, analysis, and findings demonstrate the superiority of the proposed learning-aided framework.
UR - http://www.scopus.com/inward/record.url?scp=85113877165&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85113877165&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2021.3107498
DO - 10.1109/ACCESS.2021.3107498
M3 - Article
AN - SCOPUS:85113877165
SN - 2169-3536
VL - 9
SP - 120464
EP - 120480
JO - IEEE Access
JF - IEEE Access
M1 - 9521878
ER -