ChatGPT offers high-quality, empathetic responses to patient messages
Key takeaways:
- An AI chatbot generated higher-quality responses to patient questions than physicians did.
- The chatbot had a 9.8 times higher prevalence of empathetic responses.
An artificial intelligence chatbot generated higher-quality, more empathetic responses to patient questions than physicians did, highlighting its potential as a burnout-reducing technology in primary care practices, according to researchers.
With the rapid expansion of virtual health care, primary care physicians have recently seen a surge in patient messages and, subsequently, more work and burnout, John W. Ayers, PhD, MA, an associate adjunct professor of medicine at the University of California, San Diego, and colleagues wrote in JAMA Internal Medicine.
“The COVID-19 pandemic hastened the adoption of virtual health care, concomitant with a 1.6-fold increase in electronic patient messages, with each message adding 2.3 minutes of work in the electronic health record and more after-hours work,” Ayers and colleagues wrote. “Additional messaging volume predicts increased burnout for clinicians with 62% of physicians, a record high, reporting at least one burnout symptom.”
Artificial intelligence (AI) presents a possible solution to this growing message burden, they wrote, by drafting responses to patient questions that clinicians can then review.
“Current approaches to decreasing these message burdens include limiting notifications, billing for responses, or delegating responses to less trained support staff,” they wrote. “Unfortunately, these strategies can limit access to high-quality health care.”
ChatGPT vs. physician responses
The researchers conducted a cross-sectional study to assess whether ChatGPT, a popular AI chatbot assistant released in November 2022, can provide quality, empathetic responses to patients’ questions.
Ayers and colleagues used Reddit’s AskDocs forum to draw 195 exchanges in which a verified physician responded to a public question in October 2022. The researchers asked the chatbot the same questions. A team of licensed health care professionals then reviewed each question alongside the anonymized, randomly ordered chatbot and physician responses. The evaluators chose which response was better and rated both quality and empathy on a scale from one to five.
Ayers and colleagues found that ChatGPT generated high-quality, empathetic responses to patient questions, a finding that warrants further exploration of the technology in clinical settings, they wrote.
Across the 195 questions, evaluators preferred the chatbot response over the physician response in 78.6% (95% CI, 75-81.8) of the 585 evaluations. Chatbot responses were significantly longer than physician responses (211 vs. 52 words) and were rated significantly higher for quality.
The proportion of responses that scored at least a four on the scale (good or very good quality) was higher for the chatbot than for physicians: 78.5% (95% CI, 72.3-84.1) vs. 22.1% (95% CI, 16.4-28.2). The researchers wrote that this amounted to a 3.6 times higher prevalence of good or very good quality responses for the chatbot.
Notably, chatbot responses were also rated as significantly more empathetic than physician responses. The proportion of responses that scored at least a four on the scale (empathetic or very empathetic) was higher for the chatbot than for physicians: 45.1% (95% CI, 38.5-51.8) vs. 4.6% (95% CI, 2.1-7.7). That equated to a 9.8 times higher prevalence of empathetic responses for the chatbot.
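The "times higher prevalence" figures for quality and empathy can be checked with simple division. The sketch below assumes each ratio is the chatbot proportion divided by the physician proportion, using the rounded percentages reported here. The researchers may have worked from unrounded values, and the function name `prevalence_ratio` is purely illustrative.

```python
def prevalence_ratio(pct_group: float, pct_reference: float) -> float:
    """Return how many times higher one prevalence is than another.

    Both arguments are percentages (e.g. 78.5 for 78.5%).
    """
    if pct_reference <= 0:
        raise ValueError("reference prevalence must be positive")
    return pct_group / pct_reference


# Rounded percentages as reported in the article (chatbot vs. physicians).
quality_ratio = prevalence_ratio(78.5, 22.1)  # good or very good quality
empathy_ratio = prevalence_ratio(45.1, 4.6)   # empathetic or very empathetic

print(f"Quality: {quality_ratio:.1f} times higher")
print(f"Empathy: {empathy_ratio:.1f} times higher")
```

Rounded to one decimal place, the two ratios come out to 3.6 and 9.8, matching the figures the researchers reported.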
“The present study should motivate research into the adoption of AI assistants for messaging, despite being previously overlooked,” Ayers and colleagues wrote. “For instance, as tested, chatbots could assist clinicians when messaging with patients by drafting a message based on a patient’s query for physicians or support staff to edit.”
This approach would fit with current message response strategies, the researchers wrote, in which teams of clinicians “often rely on canned responses or have support staff draft replies.”
“Such an AI-assisted approach could unlock untapped productivity so that clinical staff can use the time-savings for more complex tasks, resulting in more consistent responses and helping staff improve their overall communication skills by reviewing and modifying AI-written drafts,” they wrote.
Pandora’s box
In a related editorial, Ron Li, MD, a clinical assistant professor of medicine in the division of hospital medicine and Center for Biomedical Informatics Research at Stanford University School of Medicine, and colleagues wrote that, with the release of rapidly developing chatbots, “good or bad, ready or not, Pandora’s box has already been opened.”
Now, we are entering “a new era,” Li and colleagues wrote, with “a scarcity of time and human connection” but “an abundance of information.”
“The practice of medicine is much more than just processing information and associating words with concepts; it is ascribing meaning to those concepts while connecting with patients as a trusted partner to build healthier lives,” they wrote. “We can hope that emerging AI systems may help tame laborious tasks that overwhelm modern medicine and empower physicians to return our focus to treating human patients.”
References:
- Ayers JW, et al. JAMA Intern Med. 2023;doi:10.1001/jamainternmed.2023.1838.
- Li R, et al. JAMA Intern Med. 2023;doi:10.1001/jamainternmed.2023.1835.