TY - JOUR
T1 - Lung cancer survival period prediction and understanding
T2 - Deep learning approaches
AU - Doppalapudi, Shreyesh
AU - Qiu, Robin G.
AU - Badr, Youakim
N1 - Publisher Copyright: © 2020 Elsevier B.V.
PY - 2021/4
Y1 - 2021/4
N2 - Introduction: Survival period prediction through early diagnosis of cancer has many benefits. It allows both patients and caregivers to plan resources, time and intensity of care to provide the best possible treatment path for the patients. In this paper, by focusing on lung cancer patients, we build several survival prediction models using deep learning techniques to tackle both cancer survival classification and regression problems. We also conduct feature importance analysis to understand how lung cancer patients’ relevant factors impact their survival periods. We contribute to identifying an approach to estimating survivability that is commonly and practically appropriate for medical use. Methodologies: We compared the performance of three of the most popular deep learning architectures: Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). We also compared the performance of the deep learning models against traditional machine learning models. The data was obtained from the lung cancer section of the Surveillance, Epidemiology, and End Results (SEER) cancer registry. Results: The deep learning models outperformed traditional machine learning models across both classification and regression approaches. For the deep learning models, we obtained a best accuracy of 71.18 % for the classification approach when patients’ survival periods are segmented into classes of ‘<=6 months’, ‘0.5 – 2 years’ and ‘>2 years’, and a Root Mean Squared Error (RMSE) of 13.5 % and an R² value of 0.5 for the regression approach, while the traditional machine learning models saturated at 61.12 % classification accuracy and 14.87 % RMSE in regression. Conclusions: This approach can serve as a baseline for early prediction, and its predictions can be further improved with more temporal treatment information collected from treated patients. In addition, we evaluated feature importance to investigate model interpretability, gaining further insight into the survival analysis models and the factors that are important in cancer survival period prediction.
AB - Introduction: Survival period prediction through early diagnosis of cancer has many benefits. It allows both patients and caregivers to plan resources, time and intensity of care to provide the best possible treatment path for the patients. In this paper, by focusing on lung cancer patients, we build several survival prediction models using deep learning techniques to tackle both cancer survival classification and regression problems. We also conduct feature importance analysis to understand how lung cancer patients’ relevant factors impact their survival periods. We contribute to identifying an approach to estimating survivability that is commonly and practically appropriate for medical use. Methodologies: We compared the performance of three of the most popular deep learning architectures: Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). We also compared the performance of the deep learning models against traditional machine learning models. The data was obtained from the lung cancer section of the Surveillance, Epidemiology, and End Results (SEER) cancer registry. Results: The deep learning models outperformed traditional machine learning models across both classification and regression approaches. For the deep learning models, we obtained a best accuracy of 71.18 % for the classification approach when patients’ survival periods are segmented into classes of ‘<=6 months’, ‘0.5 – 2 years’ and ‘>2 years’, and a Root Mean Squared Error (RMSE) of 13.5 % and an R² value of 0.5 for the regression approach, while the traditional machine learning models saturated at 61.12 % classification accuracy and 14.87 % RMSE in regression. Conclusions: This approach can serve as a baseline for early prediction, and its predictions can be further improved with more temporal treatment information collected from treated patients. In addition, we evaluated feature importance to investigate model interpretability, gaining further insight into the survival analysis models and the factors that are important in cancer survival period prediction.
UR - http://www.scopus.com/inward/record.url?scp=85100143757&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85100143757&partnerID=8YFLogxK
U2 - 10.1016/j.ijmedinf.2020.104371
DO - 10.1016/j.ijmedinf.2020.104371
M3 - Article
C2 - 33461009
AN - SCOPUS:85100143757
SN - 1386-5056
VL - 148
JO - International Journal of Medical Informatics
JF - International Journal of Medical Informatics
M1 - 104371
ER -