AI chatbots provide ‘insufficient’ responses for some questions about endometriosis
Key takeaways:
- Several AI chatbots provided mostly accurate responses when asked common questions about endometriosis.
- The chatbots had difficulty with questions about endometriosis treatment and recurrence.
AI chatbots provided mostly accurate responses to endometriosis questions, but these resources were not comprehensive or accurate enough to replace counseling by health care experts, researchers reported in AJOG Global Reports.
“We know that patients are using AI chatbots to get information about endometriosis and their health conditions, [but] there are still gaps in what [responses] the chatbots generated and standard of care in professional guidelines,” Kimberly A. Kho, MD, professor in the department of obstetrics and gynecology, associate chief of gynecology and director of the Minimally Invasive Gynecologic Surgery Fellowship Program at the University of Texas Southwestern Medical Center, told Healio.

Cohen and colleagues asked three large language model chatbots, ChatGPT-4, Claude and Bard, to generate responses to 10 commonly asked questions about endometriosis. The responses were compared with current guidelines and expert opinion, and gynecologists rated each one as completely incorrect, mostly incorrect, mostly correct, correct but inadequate or correct and comprehensive.
On average, scores across all 10 answers were 3.69, 4.24 and 3.7 for Bard, ChatGPT and Claude, respectively. No responses were rated as “comprehensive and correct,” and chatbots provided at least “mostly correct” responses for all but two questions. Researchers observed more difficulties in chatbot responses to questions about endometriosis treatment and risk of recurrence than in responses to questions about symptoms and pathophysiology.
ChatGPT was the chatbot most associated with “comprehensive and correct” responses.
“People who are seeking this information need to understand the limitations of what is available,” Kho told Healio. “Beyond more basic questions like ‘What is endometriosis?’ we started to see that the [chatbot] outputs were more insufficient for questions about treatment recommendations. Providers with expertise in endometriosis need to be involved in creating the content that AI chatbots generate. Chatbots can only put out whatever is input into it. The chatbots learn from what is available on the internet and from what people input into it, so at this point, they lack the context of personalized patient information and clinical experience that medical specialists have.”
According to Kho, chatbots should never replace health care expert counseling. Instead, patients could use chatbots to prepare for in-person physician counseling.
“If we can trust most of the information that patients are getting from chatbots, we can move from the ground level to a much more elevated level of discussion,” Kho told Healio. “Right now, we spend a lot of time educating about what endometriosis is, with little time for detailed discussion within the limits of a clinical visit. If patients can use the tools to gain more of this information beforehand, we can elevate the conversations at their visits.”
For more information:
Kimberly A. Kho, MD, can be reached at kimberly.kho@UTSouthwestern.edu; Instagram: @Kimberly_Kho_MD and @UTSWMedCenter and X (Twitter): @UTSWNews.