February 14, 2024

ChatGPT-4 may provide accurate, reliable patient information on the Latarjet procedure

Key takeaways:

  • Results showed ChatGPT-4 used academic sources in 100% of its responses when queried about the Latarjet procedure.
  • Google used academic sources in 60% of its responses.

SAN FRANCISCO — According to results presented here, ChatGPT-4 may provide patients with accurate and reliable medical information on the Latarjet procedure for anterior shoulder instability.

“If large language models such as ChatGPT-4 do become implicated as mediums for patients to find health information or recommended sources by medical professionals, at least in this small sample that we investigated, the sources were reliable,” Kyle Kunze, MD, an orthopedic surgery resident at Hospital for Special Surgery, told Healio about results presented at the American Academy of Orthopaedic Surgeons Annual Meeting.

Data were derived from Kunze K, et al. Paper 216. Presented at: American Academy of Orthopaedic Surgeons Annual Meeting; Feb. 12-16, 2024; San Francisco.

To assess the ability of ChatGPT-4 to provide medical information on the Latarjet procedure for anterior shoulder instability, Kunze and colleagues queried 10 frequently asked questions about the procedure in both Google and ChatGPT-4.

Kunze and colleagues compared the types of questions Google and ChatGPT-4 provided, how similar the reported facts were and the types of sources used in the responses.

“Finally, we also had blinded experts grade the accuracy and relevance of the information that it was providing as a gold standard,” Kunze said.

Overall, Kunze told Healio that Google and ChatGPT-4 both provided comparable, reliable and accurate medical information. However, he said ChatGPT-4 was more likely to use academic sources, such as peer-reviewed literature, in its responses.

According to the abstract, ChatGPT-4 used academic sources in 100% of its responses, while Google used them in 60% of its responses.

Additionally, Kunze said there was “little overlap” between Google and ChatGPT-4 when asked to provide a list of 10 frequently asked questions related to the Latarjet procedure.

“It will be important moving forward to investigate this further with larger ranges of questions and with different medical conditions to confirm that, at least in musculoskeletal health, ChatGPT is reliable for a wider range of conditions,” Kunze said. “With further validation and proof that ChatGPT is a reliable source, large language models can become integrated into clinical workflows and leveraged by providers to decrease some of the burden that they get in their daily practices.”