Machine learning model accurately estimates PHQ-9 scores from clinical notes
Key takeaways:
- A machine learning model processed data from more than 32,000 records and 96,000 encounters of patients with MDD.
- Applying the model to the full dataset yielded PHQ-9 scores for about 2.7 times as many patient encounters and 1.2 times as many patients as had recorded scores.
MIAMI BEACH, Fla. — A novel machine learning model accurately estimated scores from a depression questionnaire from complete and partial clinical notes, per a poster at the American Society of Clinical Psychopharmacology annual meeting.
“In the real world, there’s a lot of missing data and [Patient Health Questionnaire-9] scores are commonly used and patients record them, but not everybody uses them every time,” Carl D. Marci, MD, chief psychiatrist and managing director of mental health and neurosciences at OM1 Inc., a Boston-based health care technology company, told Healio. “We wondered whether we could take a data science approach and take psychiatrists’ notes, have the computer analyze it, and generate a score.”
Marci and colleagues sought to apply a novel machine learning method to accurately estimate Patient Health Questionnaire-9 scores from a series of unorganized and partially organized clinical notes for patients with depression.
They utilized data drawn from the OM1 PremiOM Major Depressive Disorder dataset, which contained information for over 490,000 individuals with MDD receiving treatment across the United States.
Those with MDD who had both a recorded PHQ-9 score and clinical notes of any kind (32,802 records; 96,891 patient encounters) were randomly assigned to a training cohort used to build the model or to a validation cohort (15,792 records; 46,333 patient encounters).
Researchers used area under the receiver operating characteristic curve (AUC) to assess the performance of each model. Continuous electronic PHQ-9 (ePHQ-9) scores were evaluated using Spearman and Pearson R values, which are typical in developing validated estimated endpoints in other medical specialties.
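For readers unfamiliar with these metrics, the following is a minimal sketch of how AUC and the Pearson and Spearman correlations can be computed when comparing estimated scores with recorded ones. It is not the researchers' code. The score lists are made-up illustrative values, and the PHQ-9 cut point of 10 used to create a binary label for AUC is an assumption, since the article does not say how the binary outcome was defined.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: linear agreement between two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def auc(labels, scores):
    """Probability that a random positive case outscores a random negative
    case, counting ties as half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative values only, not data from the study.
recorded = [2, 5, 9, 12, 15, 20]    # hypothetical recorded PHQ-9 scores
estimated = [3, 4, 10, 11, 17, 19]  # hypothetical model estimates

# Assumed binary outcome for AUC: recorded PHQ-9 score of 10 or higher.
labels = [1 if s >= 10 else 0 for s in recorded]

pearson = pearson_r(recorded, estimated)
spearman = spearman_r(recorded, estimated)
auc_value = auc(labels, estimated)
print(f"Pearson R: {pearson:.3f}, Spearman R: {spearman:.3f}, AUC: {auc_value:.3f}")
```

Pearson R captures how closely the estimates track the recorded scores on a linear scale, Spearman R captures whether they preserve the ordering of patients, and AUC summarizes how well the estimates separate patients above and below a chosen threshold.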
According to results, applying the model to the entire original dataset generated ePHQ-9 scores for more than 2.2 million patient encounters, about 2.7 times the 814,166 recorded PHQ-9 scores. It also generated scores for 208,692 patients, about 1.2 times the original 174,897 individuals with PHQ-9 scores.
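The reported multiples can be checked with simple arithmetic. The sketch below uses the counts quoted above. Because the encounter total is given only as "more than 2.2 million," the figure of 2,200,000 is an assumed lower bound, not the exact count.

```python
# Counts quoted in the article.
recorded_scores = 814_166     # recorded PHQ-9 scores
recorded_patients = 174_897   # individuals with recorded PHQ-9 scores
estimated_patients = 208_692  # patients with model-generated scores

# Assumption: "more than 2.2 million" is treated as a lower bound of 2,200,000.
estimated_encounters_lower_bound = 2_200_000

encounter_ratio = estimated_encounters_lower_bound / recorded_scores
patient_ratio = estimated_patients / recorded_patients

print(f"Encounters: about {encounter_ratio:.1f} times the recorded scores")
print(f"Patients: about {patient_ratio:.1f} times the original count")
```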
“When you’re doing research in real-world data sets like this and you begin to define your cohort, you have a much higher chance of having a clinical endpoint,” Marci said.