TY - JOUR
T1 - Domain generalization for enhanced predictions of hospital readmission on unseen domains among patients with diabetes
AU - Hai, Ameen Abdel
AU - Weiner, Mark G.
AU - Livshits, Alice
AU - Brown, Jeremiah R.
AU - Paranjape, Anuradha
AU - Hwang, Wenke
AU - Kirchner, Lester H.
AU - Mathioudakis, Nestoras
AU - French, Esra Karslioglu
AU - Obradovic, Zoran
AU - Rubin, Daniel J.
N1 - Publisher Copyright: © 2024 Elsevier B.V.
PY - 2024/12
Y1 - 2024/12
N2 - A prediction model to assess the risk of hospital readmission can be valuable to identify patients who may benefit from extra care. Developing hospital-specific readmission risk prediction models using local data is not feasible for many institutions. Models developed on data from one hospital may not generalize well to another hospital. There is a lack of an end-to-end adaptable readmission model that can generalize to unseen test domains. We propose an early readmission risk domain generalization network, ERR-DGN, for cross-domain knowledge transfer. ERR-DGN internalizes the shared patterns and characteristics that are consistent across source domains, enabling it to adapt to a new domain. It transforms source datasets to a common embedding space while capturing relevant temporal long-term dependencies of sequential data. Domain generalization is then applied on domain-specific fully connected linear layers. The model is optimized by a loss function that integrates distribution discrepancy loss to match the mean embeddings of multiple source distributions with the task-specific loss. A model was developed using electronic health record (EHR) data of 201,688 patients with diabetes across urban, suburban, rural, and mixed hospital systems to enhance 30-day readmission predictions among patients with diabetes on 67,066 unseen patients at a rural hospital. We also explored how model performance varied by the number of sites and over time. The proposed method outperformed the baseline models, yielding a 6 % increase in F1-score (0.79 ± 0.006 vs. 0.73 ± 0.007). Model performance peaked with the inclusion of three sites. Performance of the model was relatively stable for 3 years then declined at 4 years. ERR-DGN may be a proficient tool for learning data from multiple sites and subsequently applying a hospitalization readmission prediction model to a new site. Including a relatively small number of varied sites may be sufficient to achieve peak performance. Periodic retraining at least every 3 years may mitigate model degradation over time.
AB - A prediction model to assess the risk of hospital readmission can be valuable to identify patients who may benefit from extra care. Developing hospital-specific readmission risk prediction models using local data is not feasible for many institutions. Models developed on data from one hospital may not generalize well to another hospital. There is a lack of an end-to-end adaptable readmission model that can generalize to unseen test domains. We propose an early readmission risk domain generalization network, ERR-DGN, for cross-domain knowledge transfer. ERR-DGN internalizes the shared patterns and characteristics that are consistent across source domains, enabling it to adapt to a new domain. It transforms source datasets to a common embedding space while capturing relevant temporal long-term dependencies of sequential data. Domain generalization is then applied on domain-specific fully connected linear layers. The model is optimized by a loss function that integrates distribution discrepancy loss to match the mean embeddings of multiple source distributions with the task-specific loss. A model was developed using electronic health record (EHR) data of 201,688 patients with diabetes across urban, suburban, rural, and mixed hospital systems to enhance 30-day readmission predictions among patients with diabetes on 67,066 unseen patients at a rural hospital. We also explored how model performance varied by the number of sites and over time. The proposed method outperformed the baseline models, yielding a 6 % increase in F1-score (0.79 ± 0.006 vs. 0.73 ± 0.007). Model performance peaked with the inclusion of three sites. Performance of the model was relatively stable for 3 years then declined at 4 years. ERR-DGN may be a proficient tool for learning data from multiple sites and subsequently applying a hospitalization readmission prediction model to a new site. Including a relatively small number of varied sites may be sufficient to achieve peak performance. Periodic retraining at least every 3 years may mitigate model degradation over time.
UR - http://www.scopus.com/inward/record.url?scp=85208970044&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85208970044&partnerID=8YFLogxK
U2 - 10.1016/j.artmed.2024.103010
DO - 10.1016/j.artmed.2024.103010
M3 - Article
C2 - 39556977
AN - SCOPUS:85208970044
SN - 0933-3657
VL - 158
JO - Artificial Intelligence in Medicine
JF - Artificial Intelligence in Medicine
M1 - 103010
ER -