Ethical and Professional Decision-Making Capabilities of Artificial Intelligence Chatbots: Evaluating ChatGPT’s Professional Competencies in Medicine

John C. Lin, Sai S. Kurapati, David N. Younessi, Ingrid U. Scott, Dan A. Gong

Research output: Contribution to journal › Article › peer-review

Abstract

Purpose: We examined the performance of artificial intelligence chatbots on the PREview Practice Exam, an online situational judgment test for professionalism and ethics. Methods: We used validated methodologies to calculate scores and descriptive statistics, χ² tests, and Fisher’s exact tests to compare scores by model and competency. Results: GPT-3.5 and GPT-4 scored 6/9 (76th percentile) and 7/9 (92nd percentile), respectively, higher than medical school applicant averages of 5/9 (56th percentile). Both models answered 95+% of questions correctly. Conclusions: Chatbots outperformed the average applicant on PREview, suggesting their potential for healthcare training and decision-making and highlighting risks of online assessment delivery.

Original language: English (US)
Journal: Medical Science Educator
State: Accepted/In press - 2024

All Science Journal Classification (ASJC) codes

  • Medicine (miscellaneous)
  • Education
