Fact checked by Heather Biele

February 27, 2025

Machine learning models help to predict progression to schizophrenia, bipolar disorder

Key takeaways:

  • The best model for predicting schizophrenia performed substantially better than the best bipolar disorder-predicting model.
  • Text-based features from clinical notes improved the models’ predictive power.

Machine learning models trained on electronic health record data were able to successfully predict diagnostic progression to schizophrenia and bipolar disorder among patients with other mental illnesses, according to study results.

However, the models performed better in predicting schizophrenia than bipolar disorder, according to the study, published in JAMA Psychiatry.

Data were derived from Hansen L, et al. JAMA Psychiatry. 2025;doi:10.1001/jamapsychiatry.2024.4702.

Diagnosis of schizophrenia and bipolar disorder is often delayed because symptoms overlap with those of other disorders or because patients do not yet meet full diagnostic criteria. These delays postpone treatment and may worsen prognosis.

Clinical notes recorded in EHRs may serve as a helpful diagnostic tool, providing information on symptoms, treatment responses and patient-physician interactions.

“Using methods from natural language processing and deep learning, it may be possible to extract and synthesize data from clinical notes, uncovering patterns that could indicate an impending progression from less severe conditions to schizophrenia or bipolar disorder,” Lasse Hansen, MSc, PhD, of the department of clinical medicine at Aarhus University, and colleagues wrote.

This inspired the researchers to conduct a retrospective cohort study to determine if machine learning models trained on routine clinical data from EHRs can accurately predict whether patients currently receiving psychiatric treatment are at risk for progression to schizophrenia or bipolar disorder within 5 years.

The cohort included 24,449 patients (median age, 32.2 years; interquartile range, 24.2-42.5 years; 56.6% women) aged 15 to 60 years who had at least two contacts — a minimum of 3 months apart — with the Psychiatric Services of the Central Denmark Region between January 2013 and November 2016, for a total of 398,922 outpatient contacts.

The researchers trained models to predict the 5-year probability of progression to schizophrenia or bipolar disorder, either separately or together (joint model), with each prediction made the day before a scheduled outpatient contact. They used a training set and an external test set.

First, the researchers evaluated the performance of the joint model and noted it was about as effective as the individual models when applied to schizophrenia (area under the receiver operating characteristic curve [AUROC]: 0.75 vs. 0.78) and bipolar disorder (AUROC: 0.66 vs. 0.67) in the training set.
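For readers unfamiliar with AUROC, the short sketch below shows one common way to read it: the share of patient pairs (one who progressed, one who did not) in which the model assigned the higher risk score to the patient who progressed. This is not the study's code. The function name `auroc` and the toy labels and scores are invented for illustration only.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    labels: 1 = progressed, 0 = did not progress (hypothetical toy encoding).
    scores: model risk scores; higher means higher predicted risk.
    Tied scores count as half a correct ranking.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data, invented for illustration; not taken from the study.
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7]
print(auroc(toy_labels, toy_scores))
```

On this reading, an AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking.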

When predicting the first occurrence of either disorder, the joint model achieved an AUROC of 0.7 (95% CI, 0.7-0.7) on the training set and 0.64 (95% CI, 0.63-0.65) on the test set.

Using a predicted positive rate of 4%, the researchers found that the joint model had a sensitivity of 9.3%, a specificity of 96.3% and a positive predictive value (PPV) of 13%. On the test set, this model generally discriminated better for schizophrenia (AUROC = 0.74; 95% CI, 0.73-0.75) than for bipolar disorder (AUROC = 0.57; 95% CI, 0.56-0.58).
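The relationship among predicted positive rate, sensitivity, specificity and PPV can be sketched as follows. This is a minimal illustration, assuming that a fixed share of the highest-scoring cases is flagged as positive. It is not the researchers' implementation, and the function name `metrics_at_positive_rate` and the toy data are invented for illustration.

```python
def metrics_at_positive_rate(labels, scores, positive_rate):
    """Flag the top `positive_rate` share of cases by score, then compute
    sensitivity, specificity and PPV from the resulting confusion matrix.

    labels: 1 = progressed, 0 = did not progress (hypothetical toy encoding).
    """
    n = len(labels)
    n_flagged = max(1, round(n * positive_rate))
    # Indices sorted from highest to lowest risk score.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    flagged = set(order[:n_flagged])

    tp = sum(1 for i in flagged if labels[i] == 1)  # flagged and progressed
    fp = n_flagged - tp                             # flagged, did not progress
    fn = sum(labels) - tp                           # missed progressions
    tn = n - tp - fp - fn                           # correctly not flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Toy data, invented for illustration; not taken from the study.
toy_labels = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.1, 0.3, 0.2, 0.05, 0.15, 0.4, 0.25, 0.12]
toy_metrics = metrics_at_positive_rate(toy_labels, toy_scores, 0.2)
print(toy_metrics)
```

Because only a small share of cases is flagged, specificity tends to be high while sensitivity stays low, which is the pattern seen in the reported figures.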

Finally, when looking at the individual models, the schizophrenia model performed better than the bipolar model in AUROC (0.8; 95% CI, 0.79-0.81 vs. 0.62; 95% CI, 0.61-0.63), sensitivity (19.4% vs. 9.9%), specificity (96.3% vs. 96.2%) and PPV (10.8% vs. 8.4%). This difference is probably due to the more heterogeneous clinical manifestations of bipolar disorder compared with schizophrenia, the authors wrote.

Notably, models that included clinical notes in the form of text features seemed to perform better on both the training and test sets. Specifically, the researchers noted the terms “admission,” relating to hospitalization, and “voices,” relating to auditory hallucinations, were particularly impactful.

The researchers noted several limitations to this study, including that the models’ performance was evaluated on data from two hospital sites and that the data lacked information from primary care.

“These findings suggest that detecting progression to schizophrenia through machine learning based on routine clinical data is feasible, which may reduce diagnostic delay and duration of untreated illness,” Hansen and colleagues wrote.

“Models predicting schizophrenia might be more suitable for implementation than those predicting bipolar disorder due to substantially better predictive performance,” they added.