TY - JOUR
T1 - Incorporating medical code descriptions for diagnosis prediction in healthcare
AU - Ma, Fenglong
AU - Wang, Yaqing
AU - Xiao, Houping
AU - Yuan, Ye
AU - Chitta, Radha
AU - Zhou, Jing
AU - Gao, Jing
N1 - Publisher Copyright: © 2019 The Author(s).
PY - 2019/12/19
Y1 - 2019/12/19
N2 - Background: Diagnosis prediction aims to predict the future health status of patients according to their historical electronic health records (EHR), which is an important yet challenging task in healthcare informatics. Existing diagnosis prediction approaches mainly employ recurrent neural networks (RNN) with attention mechanisms to make predictions. However, these approaches ignore the importance of code descriptions, i.e., the medical definitions of diagnosis codes. We believe that taking diagnosis code descriptions into account can help the state-of-the-art models not only to learn meaningful code representations, but also to improve the predictive performance, especially when the EHR data are insufficient. Methods: We propose a simple but general diagnosis prediction framework, which includes two basic components: diagnosis code embedding and predictive model. To learn interpretable code embeddings, we apply convolutional neural networks (CNN) to model medical descriptions of diagnosis codes extracted from online medical websites. The learned medical embedding matrix is used to embed the input visits into vector representations, which are fed into the predictive models. Any existing diagnosis prediction approach (referred to as the base model) can be cast into the proposed framework as the predictive model (called the enhanced model). Results: We conduct experiments on two real medical datasets: the MIMIC-III dataset and the Heart Failure claim dataset. Experimental results show that the enhanced diagnosis prediction approaches significantly improve the prediction performance. Moreover, we validate the effectiveness of the proposed framework when EHR data are insufficient. Finally, we visualize the learned medical code embeddings to show the interpretability of the proposed framework. Conclusions: Given the historical visit records of a patient, the proposed framework is able to predict the next visit information by incorporating medical code descriptions.
AB - Background: Diagnosis prediction aims to predict the future health status of patients according to their historical electronic health records (EHR), which is an important yet challenging task in healthcare informatics. Existing diagnosis prediction approaches mainly employ recurrent neural networks (RNN) with attention mechanisms to make predictions. However, these approaches ignore the importance of code descriptions, i.e., the medical definitions of diagnosis codes. We believe that taking diagnosis code descriptions into account can help the state-of-the-art models not only to learn meaningful code representations, but also to improve the predictive performance, especially when the EHR data are insufficient. Methods: We propose a simple but general diagnosis prediction framework, which includes two basic components: diagnosis code embedding and predictive model. To learn interpretable code embeddings, we apply convolutional neural networks (CNN) to model medical descriptions of diagnosis codes extracted from online medical websites. The learned medical embedding matrix is used to embed the input visits into vector representations, which are fed into the predictive models. Any existing diagnosis prediction approach (referred to as the base model) can be cast into the proposed framework as the predictive model (called the enhanced model). Results: We conduct experiments on two real medical datasets: the MIMIC-III dataset and the Heart Failure claim dataset. Experimental results show that the enhanced diagnosis prediction approaches significantly improve the prediction performance. Moreover, we validate the effectiveness of the proposed framework when EHR data are insufficient. Finally, we visualize the learned medical code embeddings to show the interpretability of the proposed framework. Conclusions: Given the historical visit records of a patient, the proposed framework is able to predict the next visit information by incorporating medical code descriptions.
UR - http://www.scopus.com/inward/record.url?scp=85076968442&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85076968442&partnerID=8YFLogxK
U2 - 10.1186/s12911-019-0961-2
DO - 10.1186/s12911-019-0961-2
M3 - Article
C2 - 31856806
AN - SCOPUS:85076968442
SN - 1472-6947
VL - 19
JO - BMC Medical Informatics and Decision Making
JF - BMC Medical Informatics and Decision Making
M1 - 267
ER -