August 11, 2022

Use of EMR data alone may lead to overestimates of survival time of patients with cancer

Key takeaways:

  • The cancer registry included more than twice as many deaths as the EMR.
  • Median OS from the date of diagnosis was 58.7 months using EMR data compared with 20.8 months using EMR data combined with cancer registry data.
  • Survival estimates in studies that use the EMR as the only data source may be misleading.

Use of electronic medical record data alone led to overestimates of OS compared with survival curves generated using combined EMR and cancer registry data, according to study results published in JCO Clinical Cancer Informatics.

Rationale and methods

Use of real-world data enables researchers to observe outcomes of patients not treated as part of clinical trials, but rather in routine practice, Michael F. Gensheimer, MD, clinical associate professor in the department of radiation oncology at Stanford University School of Medicine, told Healio.

Figure: Median OS (in months) from date of cancer diagnosis.

“Pharmaceutical companies and others are excited about real-world data and real-world evidence because they can potentially help us learn how treatments work in situations not well represented in clinical trials, such as for patients with rare tumor mutations or elderly patients,” Gensheimer said. “EMR data has huge advantages for research. For example, it has rich detail about patients’ comorbidities and symptoms that might not be available anywhere else.”

However, there are many pitfalls with using EMR data for research, he added.

“One of those is incomplete data,” Gensheimer said. “In my clinical research, I noticed that patients who were known to be deceased were often not marked as deceased in the EMR. If using EMR data for real-world evidence generation, one might think that the survival rate from a treatment was better than it actually was. This got me wondering how big of an issue this is and whether it would affect the results of real-world data studies that have survival endpoints.”

The retrospective study included 4,077 patients who had been diagnosed with metastatic cancer or received initial treatment between 2008 and 2014. Gensheimer and colleagues calculated OS using the Kaplan-Meier method and generated survival curves using either EMR follow-up data alone or EMR data supplemented with data from the Stanford Cancer Registry/California Cancer Registry.
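To make the comparison concrete, the following is a minimal sketch of a Kaplan-Meier estimate in Python. It is not the researchers' code. The six-patient data set is entirely hypothetical and is chosen only to show how deaths that are missing from one data source, and therefore treated as censored follow-up, can push the estimated median OS upward. The function names `kaplan_meier` and `median_survival` are illustrative.

```python
from fractions import Fraction


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up time per patient (months)
    events : 1 if death was recorded at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = Fraction(1)  # exact arithmetic avoids float rounding at S(t) = 0.5
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - Fraction(deaths, n_at_risk)
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which S(t) is 0.5 or lower; None if never reached."""
    for t, s in curve:
        if s <= Fraction(1, 2):
            return t
    return None


# Hypothetical cohort: every death is known (as with registry-supplemented data).
full_times = [4, 8, 12, 20, 30, 40]
full_events = [1, 1, 1, 1, 1, 1]

# Same hypothetical cohort seen through one incomplete source: the deaths at
# 8 and 12 months were never recorded, so those patients appear censored at
# their last recorded contact (3 and 5 months).
partial_times = [4, 3, 5, 20, 30, 40]
partial_events = [1, 0, 0, 1, 1, 1]

median_full = median_survival(kaplan_meier(full_times, full_events))
median_partial = median_survival(kaplan_meier(partial_times, partial_events))
print(median_full, median_partial)  # 12 30
```

In this toy example, the patients whose deaths went unrecorded were also the ones who left follow-up early, so the incomplete data set yields a median of 30 months against 12 months when every death is counted.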

Median follow-up for data from the EMR and Stanford Cancer Registry/California Cancer Registry was 19.9 months, and median follow-up among the surviving patient population was 67.6 months.

Key findings

Researchers found 1,301 deaths recorded in the EMR compared with 3,140 deaths recorded in the cancer registry.

They reported median OS from the date of diagnosis of 58.7 months (95% CI, 54.2-63.2) with EMR data compared with 20.8 months (95% CI, 19.6-22.3) with EMR data combined with cancer registry data.

“We are lucky at our medical center to have EMR data for tens of thousands of cancer patients available for research, and also access to gold-standard survival data through the state cancer registry,” Gensheimer said. “We found that the quality of the EMR data was worse than ever expected, with only 41% of deaths recorded in the EMR. We also found that sicker patients tended to be lost to follow-up, leading to informative censoring, which violates a key assumption of the Kaplan-Meier method. This caused median OS to be overestimated by over a factor of two, for a median of 5 years vs. the reality of less than 2 years.”
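The figures quoted above can be checked with simple arithmetic. The short sketch below uses only numbers reported in this article and assumes the cancer registry death count is the denominator for the share of deaths captured in the EMR. The variable names are illustrative.

```python
# Reported death counts
emr_deaths = 1301
registry_deaths = 3140

# Reported median OS in months
median_os_emr = 58.7
median_os_combined = 20.8

# Share of deaths captured in the EMR (assumes the registry count is the reference)
capture_pct = 100 * emr_deaths / registry_deaths

# How much larger the EMR-only median is than the combined-data median
overestimate_ratio = median_os_emr / median_os_combined

# Medians converted to years
median_os_emr_years = median_os_emr / 12
median_os_combined_years = median_os_combined / 12

print(f"Deaths captured in EMR: {capture_pct:.0f}%")
print(f"Overestimate ratio: {overestimate_ratio:.1f}x")
print(f"Median OS: {median_os_emr_years:.1f} vs. {median_os_combined_years:.1f} years")
```

These values line up with the quoted summary: about 41% of deaths captured, an overestimate of more than a factor of two, and a median of roughly 5 years against less than 2 years.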

Implications

The findings from this study cast doubt on survival estimates from many retrospective institutional studies published in oncology journals, Gensheimer said.

“If authors state that their patients’ survival time was better than that of comparable patients from SEER or other sources, it could be from improved treatment efficacy, but also other possible reasons, such as data quality issues,” he added. “Hopefully this study will raise awareness of this issue, which is the first step in improving the situation. We recommend that researchers try to supplement their EMR survival data using other sources, like state cancer registries or the National Death Index. Unfortunately, the Social Security Death Master File, which was previously very useful for this, is no longer complete and is not sufficient. This is not an easy problem to solve.”

For more information:

Michael F. Gensheimer, MD, can be reached at mgens@stanford.edu.