Incorporating medical code descriptions for diagnosis prediction in healthcare

Fenglong Ma, Yaqing Wang, Houping Xiao, Ye Yuan, Radha Chitta, Jing Zhou, Jing Gao

Research output: Contribution to journal › Article › peer-review

7 Scopus citations


Background: Diagnosis prediction aims to predict the future health status of patients according to their historical electronic health records (EHR), which is an important yet challenging task in healthcare informatics. Existing diagnosis prediction approaches mainly employ recurrent neural networks (RNN) with attention mechanisms to make predictions. However, these approaches ignore the importance of code descriptions, i.e., the medical definitions of diagnosis codes. We believe that taking diagnosis code descriptions into account can help state-of-the-art models not only to learn meaningful code representations, but also to improve predictive performance, especially when the EHR data are insufficient.

Methods: We propose a simple but general diagnosis prediction framework, which includes two basic components: diagnosis code embedding and a predictive model. To learn interpretable code embeddings, we apply convolutional neural networks (CNN) to model the medical descriptions of diagnosis codes extracted from online medical websites. The learned medical embedding matrix is used to embed the input visits into vector representations, which are fed into the predictive model. Any existing diagnosis prediction approach (referred to as the base model) can be cast into the proposed framework as the predictive model (called the enhanced model).

Results: We conduct experiments on two real medical datasets: the MIMIC-III dataset and the Heart Failure claim dataset. Experimental results show that the enhanced diagnosis prediction approaches significantly improve the prediction performance. Moreover, we validate the effectiveness of the proposed framework with insufficient EHR data. Finally, we visualize the learned medical code embeddings to show the interpretability of the proposed framework.

Conclusions: Given the historical visit records of a patient, the proposed framework is able to predict the next visit information by incorporating medical code descriptions.
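The two steps named in the Methods (a CNN over code descriptions that yields an embedding matrix, followed by embedding each visit with that matrix) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the article: the codes `D1` to `D3` and their descriptions are hypothetical, the weights are random rather than learned, the filter sizes are toy values, and summing code embeddings to form a visit vector is one simple choice among several.

```python
import random

EMBED_DIM = 4     # word-vector size (toy value, assumed)
NUM_FILTERS = 5   # one output dimension per convolution filter (assumed)
WINDOW = 2        # number of consecutive words each filter spans (assumed)

# Hypothetical codes and descriptions, for illustration only.
DESCRIPTIONS = {
    "D1": "congestive heart failure",
    "D2": "essential hypertension",
    "D3": "chronic kidney disease",
}

rng = random.Random(0)


def random_vector(size):
    """Return a list of `size` floats drawn uniformly from [-1, 1]."""
    return [rng.uniform(-1.0, 1.0) for _ in range(size)]


# Randomly initialised parameters; in a real model these would be learned.
vocab = sorted({w for text in DESCRIPTIONS.values() for w in text.split()})
word_table = {w: random_vector(EMBED_DIM) for w in vocab}
filters = [random_vector(WINDOW * EMBED_DIM) for _ in range(NUM_FILTERS)]


def text_cnn_embed(word_vecs, filters, window, dim):
    """Convolve over word windows, apply ReLU, then max-pool over positions."""
    # Pad short descriptions with zero vectors so at least one window exists.
    padded = word_vecs + [[0.0] * dim] * max(0, window - len(word_vecs))
    pooled = []
    for weights in filters:  # each filter is a flat list of window * dim weights
        best = 0.0  # ReLU outputs are >= 0, so 0.0 is a valid floor for the max
        for start in range(len(padded) - window + 1):
            flat = [x for vec in padded[start:start + window] for x in vec]
            activation = sum(w * x for w, x in zip(weights, flat))
            best = max(best, activation)
        pooled.append(best)
    return pooled


def build_embedding_matrix(descriptions, word_table, filters, window, dim):
    """Map every code to the CNN embedding of its textual description."""
    matrix = {}
    for code, text in descriptions.items():
        word_vecs = [word_table[w] for w in text.split()]
        matrix[code] = text_cnn_embed(word_vecs, filters, window, dim)
    return matrix


def embed_visit(visit_codes, matrix):
    """Represent a visit as the sum of the embeddings of its codes."""
    dim = len(next(iter(matrix.values())))
    visit_vec = [0.0] * dim
    for code in visit_codes:
        for i, value in enumerate(matrix[code]):
            visit_vec[i] += value
    return visit_vec


matrix = build_embedding_matrix(DESCRIPTIONS, word_table, filters, WINDOW, EMBED_DIM)
visit_vec = embed_visit(["D1", "D2"], matrix)
print(visit_vec)
```

In the framework the abstract describes, a sequence of such visit vectors would then be passed to the base model (for example an RNN) to predict the codes of the next visit; that predictive step is omitted here.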

Original language: English (US)
Article number: 267
Journal: BMC Medical Informatics and Decision Making
State: Published - Dec 19 2019

All Science Journal Classification (ASJC) codes

  • Health Policy
  • Health Informatics

