Fact checked by Kristen Dowd


September 11, 2023
1 min read

ChatGPT performed at postgraduate year-1 level on Orthopaedic In-Training Examination


Key takeaways:

  • ChatGPT 3.5 correctly answered 54.3% of questions on the Orthopaedic In-Training Examination, corresponding to a PGY-1 level.
  • ChatGPT 4 correctly answered 73.6% of questions, corresponding to a PGY-5 level.
Perspective from Jonathan M. Vigdorchik, MD

ChatGPT 3.5 correctly answered 54.3% of questions on the Orthopaedic In-Training Examination, a score that corresponds with the average for a postgraduate year-1 orthopedic resident, according to published results.

Researchers at Prisma Health-Midlands/University of South Carolina School of Medicine analyzed the performance of ChatGPT versions 3.5 and 4 on the 2020, 2021 and 2022 Orthopaedic In-Training Examinations with zero prompting. They compared the percentage of correct responses from ChatGPT with the national average of orthopedic surgery residents at each postgraduate year (PGY) level. Additionally, ChatGPT was required to provide a journal article, book or website as a verifiable source for its answer.
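
The study does not publish its query code, but the basic setup can be illustrated with a short sketch. The snippet below is a minimal illustration, assuming the OpenAI Python client: it poses a single exam-style question with no additional prompting and asks the model to name a citable source, mirroring the requirement described above. The question stem, model name and prompt wording are assumptions for illustration, not the researchers' exact protocol.

# Minimal sketch (not the study's actual code): pose one OITE-style
# question with zero prompting and ask for a verifiable source.
# Assumes the OpenAI Python client (openai>=1.0) and an API key in
# the environment; the question stem and choices are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = (
    "Which structure is most at risk during a posterior approach to the hip?\n"
    "A) ...  B) ...  C) ...  D) ..."  # placeholder stem and answer choices
)

response = client.chat.completions.create(
    model="gpt-4",  # the study compared versions 3.5 and 4
    messages=[{
        "role": "user",
        "content": question
        + "\n\nGive one answer choice and cite a journal article, "
          "book or website as a verified source.",
    }],
)

print(response.choices[0].message.content)

Scoring then reduces to comparing each response against the exam's answer key and reporting the percentage correct alongside the published PGY averages.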

Data were derived from Kung JE, et al. JBJS Open Access. 2023;doi:10.2106/JBJS.OA.23.00056.

ChatGPT 3.5 correctly answered 196 of 360 questions (54.3%), a score that corresponds with the average for PGY-1 residents, and cited a verifiable source for 47.2% of questions, with an average median journal impact factor of 5.4.

ChatGPT 4 correctly answered 265 of 360 questions (73.6%), a score that corresponds with the average for PGY-5 residents and exceeds the 67% passing score for the American Board of Orthopaedic Surgery part I examination. It cited a verifiable source for 87.9% of questions, with an average median journal impact factor of 5.2.

“Owing to the rapidly changing standard set by [artificial intelligence], it is important for orthopedic surgeons to be involved in the integration of [artificial intelligence] into this field and to guide it to a position where it can be used in providing excellent patient care,” the researchers wrote in the study.

“ChatGPT demonstrated comparable knowledge with that of orthopedic residents, and with further advancement, may possibly be used in orthopedic medical education, patient education and clinical decision-making,” they concluded.