AI tools in hematology: With thorough assessment, ‘informed decisions’ are possible
The emerging use of artificial intelligence in health care presents a new set of ethical issues for clinicians to consider.
A special interest session at the ASH Annual Meeting and Exposition addressed the new possibilities and ethical dilemmas surrounding AI in diagnosis, drug development, clinical trials and more.
In one of the session's four talks, Yates Coley, PhD, associate biostatistics investigator at Kaiser Permanente Washington Health Research Institute and affiliate assistant professor in the University of Washington's department of biostatistics, offered guidance on evaluating AI clinical prediction models and other AI tools to help ensure equitable access to care.
“It’s common for folks to be daunted by buzzwords like ‘artificial intelligence,’ ‘cognitive computing’ and ‘advanced analytics’ — these words can be used to intimidate and deter criticism from people who are outside of the statistical or computer science world,” Coley said. “However, it is possible to make informed decisions about using clinical prediction models and other AI tools in practice without having advanced training in computer science or statistics. Even if you don’t understand everything going on under the hood, you can evaluate how well it does in your population.”
Asking the right questions
Although decision making about using AI tools may seem complex, clinicians can ask some basic questions to guide their evaluation of these tools. Clinicians should first consider the appropriateness of the outcome that’s used to train the model, and specifically whether that outcome provided accurate measurement for all patients, Coley said.
They cited research by Obermeyer and colleagues, published in Science in 2019, which found that an algorithm used to guide referrals to advanced care management for patients with complex health needs had been trained on health care expenditures rather than on a direct measure of clinical need.
“Because our health care system is one in which more money is spent on care for white patients than for patients of color, this training underestimated the needs of patients of color,” Coley said. “The system was then being used to disproportionately funnel more resources to the white population. This is an example of where clinicians can ask about the outcome an AI tool was trained on, and whether it’s an outcome we have confidence in.”
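To make the proxy-label problem concrete, here is a toy, fully synthetic sketch in Python. All variables, coefficients and names below are invented for illustration and are not drawn from Obermeyer and colleagues' data or model; the point is only the mechanism: when both the model's inputs and its training label run below true need for one group, a cost-trained model refers that group's high-need patients less often.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 10_000
need = rng.normal(size=n)                 # true clinical need (never observed)
group_b = rng.integers(0, 2, size=n)      # 1 = group with depressed spending
# Both the model's input (past utilization) and its training label (future
# cost) run below true need for group B: the proxy gap.
past_util = need - 0.7 * group_b + rng.normal(scale=0.5, size=n)
future_cost = need - 0.7 * group_b + rng.normal(scale=0.5, size=n)

model = LinearRegression().fit(past_util.reshape(-1, 1), future_cost)
pred = model.predict(past_util.reshape(-1, 1))

flagged = pred >= np.quantile(pred, 0.9)  # top decile referred to the program
audit = pd.DataFrame({
    "group": np.where(group_b == 1, "B", "A"),
    "flagged": flagged,
})[need > 1.0]                            # restrict to truly high-need patients
# Among equally high-need patients, the cost-trained model refers group B
# less often than group A.
print(audit.groupby("group")["flagged"].mean())
```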
Another important consideration is the need to rigorously validate prediction models in a population representative of the one in which the tool will be used. Coley said a thorough, independent validation of a prediction model can alleviate a good deal of concern.
“Even if you don’t understand all the workings of some black-box model, the proof is in the performance,” they said. “We can actually see how well a model does in our population, along with how well it does in subgroups for whom we might be worried about worse outcomes or exacerbating inequities.”
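A minimal sketch of that kind of subgroup audit, assuming a pandas DataFrame with hypothetical columns y_true (observed outcome), y_score (model-predicted risk) and group (patient subgroup); the names are illustrative, not from Coley's talk:

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

def audit_discrimination(df: pd.DataFrame) -> pd.DataFrame:
    """Report discrimination (AUC) overall and within each subgroup."""
    rows = [{"group": "overall", "n": len(df),
             "auc": roc_auc_score(df["y_true"], df["y_score"])}]
    for name, g in df.groupby("group"):
        # AUC is undefined when a subgroup contains only one outcome class.
        auc = (roc_auc_score(g["y_true"], g["y_score"])
               if g["y_true"].nunique() == 2 else float("nan"))
        rows.append({"group": name, "n": len(g), "auc": auc})
    return pd.DataFrame(rows)
```

A gap between the overall AUC and a subgroup's AUC is exactly the kind of disparity such an audit is meant to surface.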
Coley referenced a paper they published in 2021 in JAMA Psychiatry, in which their group audited the performance of two prediction models for suicide risk across various racial and ethnic groups. They assessed whether there were disparities in performance that would lead to disparate impacts of using a particular AI model.
They found that these models, designed to predict risk for death by suicide in the 90 days after an outpatient mental health visit, performed poorly for patients who were Black, American Indian or Alaska Native, as well as for those with no recorded race or ethnicity information.
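Downstream impact can be probed the same way. As an illustrative sketch using the same hypothetical column names as above, one can ask: at a population-wide high-risk cutoff, what share of patients who went on to have the outcome would the model have flagged in each group?

```python
import pandas as pd

def sensitivity_by_group(df: pd.DataFrame, pct: float = 0.95) -> pd.Series:
    """Among patients who had the outcome, what share scored above a
    population-wide high-risk cutoff, broken out by subgroup?"""
    cutoff = df["y_score"].quantile(pct)  # e.g., flag the top 5% as high risk
    events = df[df["y_true"] == 1]        # patients who had the outcome
    return events.groupby("group")["y_score"].apply(lambda s: (s >= cutoff).mean())
```

A group whose true events rarely clear the cutoff would receive fewer interventions than an equally at-risk group, which is the disparate impact Coley's audit was designed to detect.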
“We need to consider the downstream clinical impact of using an AI prediction model, not just the performance of the model,” Coley said.
The role of race, ethnicity in prediction models
Coley also discussed the benefits and drawbacks of including race and ethnicity data in clinical prediction models.
“I’m not going to provide a simple answer to this — there’s no ‘yes or no; always use it, or always don’t,’” they said. “There are examples of situations in which inclusion of race and ethnicity in clinical algorithms was based on a racist assumption about biological differences between races and led to disparities in care or care being withheld from people in need. However, there are also examples of cases where, if we use race and ethnicity in our models as social determinants of health, knowing this characteristic reflects differential access to quality care, then it can improve the performance of our models.”
Because of this variability, Coley urged practices to assess the use of race and ethnicity in these models in terms of the specific population for which the models are intended.
“We just need to validate how well a model is going to do in our population to know whether including race or ethnicity is going to be driving the outcomes and the equity we’re trying to achieve.”
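One fully synthetic illustration of that local validation (all data, coefficients and names invented): fit one model with a group indicator and one without, then compare each model's calibration, that is, mean predicted risk against observed event rate, within each group.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "lab_value": rng.normal(size=n),          # hypothetical clinical predictor
    "group": rng.choice(["A", "B"], size=n),  # hypothetical patient subgroup
})
# Synthetic outcome in which group B has a higher event rate at equal lab
# values (a stand-in for a social determinant such as access to care).
logit = df["lab_value"] + 0.8 * (df["group"] == "B")
df["y_true"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_full = pd.get_dummies(df[["lab_value", "group"]], drop_first=True)
X_slim = df[["lab_value"]]
m_full = LogisticRegression().fit(X_full, df["y_true"])
m_slim = LogisticRegression().fit(X_slim, df["y_true"])

# Calibration audit: mean predicted risk vs. observed event rate per group.
for label, model, X in [("with group", m_full, X_full),
                        ("without group", m_slim, X_slim)]:
    check = df.assign(y_score=model.predict_proba(X)[:, 1])
    print(label)
    print(check.groupby("group")[["y_score", "y_true"]].mean())
```

In this toy setup, the model without the group indicator overpredicts risk for one group and underpredicts it for the other; whether that pattern appears in a real population is what local validation is meant to reveal.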
Patient privacy
Coley also touched upon the issues of privacy and data confidentiality. They noted that many AI-driven tools ask clinicians to input data that is then owned by the company that sells the tool.
“We have to be very careful with patient data, particularly when we’re thinking about genetic data,” they said. “I know there’s a lot of interest in using AI tools that use genetic data in hematology. Patient privacy and data confidentiality can become very complicated issues.”
Coley discouraged hematologists and practices from assuming that AI tools will accurately and equitably represent the needs of all patients. They noted that AI tools trained on real-world data will very likely reflect any existing inequities and structural barriers faced by disenfranchised communities.
“If we take our existing data as it’s observed in the real world and presume that it reflects clinical needs, we’ll be underestimating the needs of some communities,” Coley said.
References:
- Coley Y. AI ethics: more common than what you think. Presented at: ASH Annual Meeting and Exposition, Dec. 9-12, 2023; San Diego.
- Coley RY, et al. JAMA Psychiatry. 2021;doi:10.1001/jamapsychiatry.2021.0493.
- Obermeyer Z, et al. Science. 2019;doi:10.1126/science.aax2342.