October 16, 2023

Study evaluates potential of GPT-4 in prioritizing medical waiting lists

A study showed moderate agreement between GPT-4 and retina specialists in assessing need for treatment and urgency of treatment in patients with diabetic retinopathy, according to a poster presentation at Real World Ophthalmology.

Because diabetic retinopathy (DR) cases are on the rise and patient wait times can be long, language model (LM)-based patient prioritization tools (PPTs) could help support the prioritization decision process, Jericho Morcilla and co-authors wrote.

As a proof of principle for broader applicability of LM-based PPTs, the study compared the patient priority ratings (PPR) and treatment urgency ratings (TUR) generated by GPT-4 and by two board-certified retina specialists across four sets of anonymized DR patient profiles, sorted by DR severity, central subfield thickness (CST), modified best corrected visual acuity (BCVA) and complete randomization.
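The poster does not describe the exact prompting setup. As an illustration only, the sketch below shows how an anonymized DR profile might be submitted to GPT-4 for priority and urgency ratings through the OpenAI chat completions API; the profile fields, rating scale and prompt wording are hypothetical and not taken from the study.

# Illustrative sketch (not the authors' protocol): eliciting a patient
# priority rating (PPR) and treatment urgency rating (TUR) from GPT-4
# for one anonymized diabetic retinopathy profile.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical example profile; the study used anonymized clinic data.
profile = (
    "DR severity: severe non-proliferative; "
    "central subfield thickness (CST): 410 um; "
    "best corrected visual acuity (BCVA): 20/60"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "system",
            "content": (
                "You are assisting with retina clinic triage. "
                "Rate need for treatment and urgency of treatment "
                "on a 1-5 scale and briefly justify each rating."
            ),
        },
        {"role": "user", "content": profile},
    ],
)
print(response.choices[0].message.content)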

There was moderate agreement between GPT-4 and the retina specialists on both PPR and TUR, and agreement improved when the specialists’ decisions were averaged. Agreement was highest when the data set was sorted by CST, indicating that both GPT-4 and the specialists weight CST heavily in their decisions. In cases with intentionally modified BCVA, however, only the human evaluators detected the unrealistic scenarios, suggesting that GPT-4 lacks the critical judgment to flag clinically implausible data.
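The poster describes agreement qualitatively. One standard way to quantify agreement between ordinal ratings such as PPR or TUR is a weighted Cohen's kappa, sketched below with made-up ratings; none of these values or the choice of statistic come from the study.

# Illustrative only: quantifying agreement between ordinal ratings
# (e.g., GPT-4 vs. averaged specialist urgency ratings) with a
# quadratic-weighted Cohen's kappa.
from sklearn.metrics import cohen_kappa_score

gpt4_ratings       = [1, 2, 2, 3, 4, 4, 5, 3]  # hypothetical GPT-4 ratings
specialist_ratings = [1, 2, 3, 3, 4, 5, 5, 2]  # hypothetical averaged specialist ratings

kappa = cohen_kappa_score(gpt4_ratings, specialist_ratings, weights="quadratic")
print(f"Quadratic-weighted kappa: {kappa:.2f}")  # values around 0.4-0.6 are typically read as moderate agreement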

The inability of GPT-4 to acknowledge potential clinical improbabilities is, according to the authors, “a significant drawback” and suggests that “human evaluators may be more appropriate to guide management prioritization related to extraordinary patient cases.”