Fine-tuning for accuracy: evaluation of Generative Pretrained Transformer (GPT) for automatic assignment of International Classification of Disease (ICD) codes to clinical documentation

Khalid Nawab, Madalyn Fernbach, Sayuj Atreya, Samina Asfandiyar, Gulalai Khan, Riya Arora, Iqbal Hussain, Shadi Hijjawi, Richard Schreiber

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Background: Assigning International Classification of Disease (ICD) codes to clinical documentation is a tedious but important task that is still done mostly by hand. This study evaluated whether OpenAI’s widely used Generative Pretrained Transformer (GPT)-3.5 Turbo can help automate the assignment of ICD codes to clinical notes.

Methods: We identified the ten most prevalent ICD-10 codes in the Medical Information Mart for Intensive Care (MIMIC-IV) dataset. For each code we selected 200 notes and randomly split them into two equal groups of 100, one for training and one for testing. We passed each note to GPT-3.5 Turbo through OpenAI’s Application Programming Interface (API), prompting the model to assign ICD-10 codes to it, and checked whether the model’s response contained the target ICD-10 code. After fine-tuning the model on the training data, we repeated the process on the test data and compared the fine-tuned model’s performance with that of the default model.

Results: With the default GPT-3.5 Turbo model, the target ICD-10 code appeared among the assigned codes in 29.7% of cases. After fine-tuning with 100 notes for each of the ten codes, this rate rose to 62.6%.

Conclusions: GPT’s performance on healthcare-related tasks has historically been suboptimal. Fine-tuning as performed in this study roughly doubled the rate at which the target code was assigned, which points to a path for integrating artificial intelligence into healthcare to make this administrative task more efficient and accurate. Future research should focus on expanding the training datasets with specialized data and on exploring how these models could be integrated into existing healthcare systems to maximize their utility and reliability.
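The split and scoring steps described in the Methods can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the regular expression for ICD-10 codes, the fixed random seed, and the choice to count only an exact match (after removing the dot) as a hit are all assumptions. The model call itself is left out, so the functions work on response strings that have already been collected.

```python
import random
import re

# Assumed shape of an ICD-10 code: letter, digit, alphanumeric, then an
# optional dot and up to four more characters (e.g. "I10", "E11.9", "Z87891").
ICD10_PATTERN = re.compile(r"\b[A-Z]\d[0-9A-Z](?:\.?[0-9A-Z]{1,4})?\b")


def normalize(code):
    """Drop the dot and upper-case, so "e11.9" and "E119" compare equal."""
    return code.replace(".", "").upper()


def split_notes(notes, seed=0):
    """Shuffle the notes for one code and split them into two equal halves."""
    shuffled = list(notes)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def contains_target(response, target):
    """Return True if the target code appears among the codes in the response."""
    found = {normalize(m) for m in ICD10_PATTERN.findall(response.upper())}
    return normalize(target) in found


def accuracy(responses, targets):
    """Fraction of responses that contain their note's target code."""
    hits = sum(contains_target(r, t) for r, t in zip(responses, targets))
    return hits / len(targets)
```

Under these assumptions, `split_notes` would be called once per code on its 200 notes, and `accuracy` would be computed separately on the responses of the default and the fine-tuned model so the two rates can be compared.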

Original language: English (US)
Article number: 8
Journal: Journal of Medical Artificial Intelligence
Volume: 7
Issue number: June
State: Published - 2024

All Science Journal Classification (ASJC) codes

  • Medicine (miscellaneous)
  • Artificial Intelligence
