Artificial intelligence (AI) tools are becoming more common in medicine. They can read medical images, predict risks, and monitor patient conditions from a distance. But AI systems can also make mistakes, especially when the data they learn from is unbalanced or does not adequately represent different groups of people.
A new study led by University of California, Davis professor Courtney Lyles emphasizes the importance of keeping humans in the loop to review how AI makes decisions, in order to reduce bias and improve safety. The research was published in Social Science & Medicine.

Lyles is the director of the Center for Health Policy Research at the University of California, Davis. She is also co-founder and co-director of UC SOLVE Health Tech, an initiative involving researchers from the University of California, Davis; the University of California, Berkeley; and the University of California, San Francisco, as well as private digital health companies.
In this Q&A, Lyles answers questions about the use of AI in healthcare and how to detect and prevent bias. She also shares examples of how UC Davis Health is building fairer and more reliable AI systems to serve patients and physicians.
What is this study about?
The study was a collaboration with researchers at Google, the University of California, and Northeastern University. We used a human-centered approach to critically evaluate explainable AI models and identify areas of bias. We assembled a panel of experts from a variety of fields to find potential sources of bias in AI interpretation.
Why can bias be a problem in AI healthcare systems?
Interpreting AI models requires understanding the social and structural forces that shape health data.
Without this lens, AI systems may produce output that sounds convincing but is incomplete, biased, or unsafe. As AI becomes more integrated into daily clinical care, we can no longer rely solely on algorithms. Combining human expertise with explainable AI tools will be essential.
What is explainable AI? Why is explainable AI important in evaluating AI models?
Explainable AI (XAI) is about understanding why a model made the decisions it did. It uncovers the model's behavior so you can see how it arrived at its decisions and predictions.
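The study itself does not include code, but a minimal sketch can make the idea concrete. The example below is a hypothetical illustration using scikit-learn, synthetic data, and permutation importance (one common XAI-style attribution technique, not necessarily the method used in the study); it shows how an attribution method surfaces which inputs a model actually relies on when making predictions.

```python
# Minimal XAI sketch (illustrative only): which features drive a model's predictions?
# Assumes scikit-learn; the "patient" data here is synthetic, not from the study.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
age = rng.normal(60, 12, n)
blood_pressure = rng.normal(130, 15, n)
prior_visits = rng.poisson(2, n)
# In this toy example, the outcome depends mainly on age and prior visits.
risk = 0.03 * (age - 60) + 0.4 * prior_visits + rng.normal(0, 1, n)
y = (risk > 1.0).astype(int)

X = np.column_stack([age, blood_pressure, prior_visits])
feature_names = ["age", "blood_pressure", "prior_visits"]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Permutation importance: how much does accuracy drop when each feature is shuffled?
result = permutation_importance(model, X_test, y_test, n_repeats=20, random_state=0)
for name, score in sorted(zip(feature_names, result.importances_mean), key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")
```

In a real clinical model, output like this is exactly what a review panel would interrogate: does the pattern the model relies on make sense in the real world?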
How do human reviewers assess bias in XAI models?
Our research shows that a panel of experts from several disciplines can scrutinize the output of an XAI model and provide additional context-specific interpretation of whether the results make sense in the real world. In this study, the panel included experts in medicine, epidemiology, behavioral science, engineering, and data science.
The study also recommends including community members and patient advocates. Their lived experiences provide insights that traditional experts may miss and help ensure that AI tools reflect the needs of the communities they serve.
This interdisciplinary framework shows how bringing diverse voices into the process can make AI not only more accurate, but also fairer and more trustworthy.
How does the multidisciplinary panel evaluate the XAI results?
When XAI tools highlight why a model makes certain predictions, patterns often become apparent.
When considering XAI findings, interdisciplinary experts can ask:
- Could this pattern be caused by differences in the datasets?
- Does this result have anything to do with how patients interact with medical devices?
- Does this reflect a social or structural problem rather than a medical problem?
This process helps reveal where the AI may be relying on “shortcut features”: patterns that appear meaningful but actually reflect biases in the data.
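As a hypothetical illustration of what such a shortcut check might look like (synthetic data and invented variable names, not the study's actual analysis): if a non-clinical variable, such as which clinic site produced a record, ends up carrying much of the model's predictive weight, that points to a data-collection artifact rather than a medical signal.

```python
# Hypothetical shortcut-feature check (illustrative; synthetic data, invented names).
# If a non-clinical variable like "site" dominates the model, that suggests a
# dataset bias ("shortcut"), not a real medical relationship.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 5000
site = rng.integers(0, 2, n)                # 0 = clinic A, 1 = clinic B
severity = rng.normal(0, 1, n)              # true clinical signal
# Sampling bias: clinic B happens to see more positive cases in this dataset.
y = (severity + 1.2 * site + rng.normal(0, 1, n) > 1.0).astype(int)

X = StandardScaler().fit_transform(np.column_stack([severity, site]))
model = LogisticRegression().fit(X, y)

for name, coef in zip(["severity", "site"], model.coef_[0]):
    print(f"{name}: coefficient = {coef:.2f}")
# A large "site" coefficient would prompt the panel to ask whether the pattern
# reflects how the data were collected rather than patient biology.
```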
How can we translate this XAI research into real-world practice?
Our research includes a case study of how this multidisciplinary panel of experts reviewed real-world XAI results obtained from medical images and recommended clear next steps for research and practice.
By combining technical tools and human judgment, this approach can be used in other cases to improve accuracy and provide more context-based results. In fact, teams can be established in advance to gather the right kind of expertise at the AI decision-making table. This improves implementation and trust between data scientists, clinicians, patients, and communities.
“As AI becomes more integrated into daily clinical care, we can no longer rely solely on algorithms. Human expertise and explainable AI tools will be essential.” —Professor Courtney Lyles, Director, University of California, Davis Center for Health Policy Research
How will public-private partnerships shape AI development in healthcare?
The future will require more intentional public-private partnerships. We founded UC SOLVE Health Tech for exactly that purpose: it pairs University of California researchers with private-sector partners to improve the equity of health technology products. Much like this study, it brings people together in a structured way and fosters collaboration between sectors that are often siloed.
Industry partners seek practical academic expertise that enables them to take steps towards technology adoption in healthcare.
What are some other examples of how UC Davis Health is using AI?
UC Davis Health is a national leader in implementing AI in many clinical areas.
First, we have a strong AI Governance Committee that has been developing and reviewing new AI models for several years, led by Professor Jason Adams, Director of Data and Analytics Strategy.
Second, our IT leadership team is also a national leader in the fair assessment and deployment of AI at UC Davis Health. For example, a team led by Reshma Gupta, professor and chief of Population Health and Accountable Care, has developed a process to reduce bias when developing and implementing AI predictive models. This process was applied to a model that identifies patients who may be at high risk of readmission at UC Davis Health. Our population health team uses the model while carefully considering patient subgroups, identifying and addressing potential barriers at each stage of the AI development and dissemination process.
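The interview does not spell out the technical details of that process, but a minimal sketch of a subgroup audit (synthetic data; the group labels, threshold, and metric are assumptions for illustration, not the UC Davis Health workflow) might compare how often a risk model misses truly high-risk patients in each group.

```python
# Minimal sketch of a subgroup audit for a risk model (synthetic data; the
# group labels, threshold, and metric are illustrative assumptions, not the
# UC Davis Health process itself).
import numpy as np

rng = np.random.default_rng(2)
n = 10000
group = rng.choice(["A", "B"], size=n, p=[0.7, 0.3])
y_true = rng.binomial(1, 0.15, n)                     # actual readmissions
# Toy predicted risk scores that happen to be less informative for group B.
noise = np.where(group == "B", 0.35, 0.15)
y_score = np.clip(0.15 + 0.5 * y_true + rng.normal(0, noise, n), 0, 1)

threshold = 0.5
y_pred = (y_score >= threshold).astype(int)

for g in ["A", "B"]:
    mask = group == g
    tp = np.sum((y_pred == 1) & (y_true == 1) & mask)
    fn = np.sum((y_pred == 0) & (y_true == 1) & mask)
    fnr = fn / (tp + fn)
    print(f"Group {g}: missed high-risk patients (false negative rate) = {fnr:.2%}")
# Large gaps between groups would trigger a closer look at the model and the
# data before deployment, mirroring the "check each subgroup" idea above.
```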
Third, UC Davis Health implemented and evaluated the use of AI Scribe for note-taking in clinical settings.
What is AI Scribe and how does it work?
In 2024, UC Davis Health launched its AI Scribe program, which uses AI to generate notes for clinicians during visits. After the patient gives approval, the physician can begin an audio recording of the patient interaction. The AI application condenses the discussion into a standard clinical note format, saving doctors the time of tediously transcribing visit details. We conducted a pilot study to evaluate whether the AI scribe application generates accurate notes.
The study was led by lead biostatistician Sandra Taylor of the Department of Public Health Sciences and was recently published in the Journal of Medical Informatics Research. We found that the clinical notes generated by AI were generally of high quality, with 94.7% free of significant errors.
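As a rough sketch of how such an accuracy figure can be summarized (the note counts below are invented; only the 94.7% figure comes from the study), one can report the proportion of error-free notes together with a binomial confidence interval.

```python
# Rough sketch (not the study's analysis): estimating the share of AI-generated
# notes that are free of significant errors, with a Wilson confidence interval.
# The counts below are hypothetical; the study reported 94.7%.
import math

n_notes = 300          # hypothetical number of reviewed notes
n_clean = 284          # hypothetical notes with no significant errors

p_hat = n_clean / n_notes
z = 1.96  # ~95% confidence

# Wilson score interval for a binomial proportion.
denom = 1 + z**2 / n_notes
center = (p_hat + z**2 / (2 * n_notes)) / denom
half_width = z * math.sqrt(p_hat * (1 - p_hat) / n_notes + z**2 / (4 * n_notes**2)) / denom

print(f"Error-free proportion: {p_hat:.1%} "
      f"(95% CI: {center - half_width:.1%} to {center + half_width:.1%})")
```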
This study also highlights the need for clinicians to continually review the output of AI scribes to find and correct the small number of errors that remain, reiterating the importance of keeping humans in the loop with these tools.
About the Center for Health Policy Research
The mission of the Center for Health Policy Research (CHPR) is to promote research and education and to inform policy regarding health and health care. Its goal is to improve the health of the population by providing new knowledge about access, delivery, costs, quality, and outcomes related to health care, and by providing rigorous evidence to policy makers and other stakeholders. CHPR fulfills its mission through interdisciplinary and collaborative research; education and career development; and research synthesis and dissemination. Learn more about CHPR's weekly seminar series and its research themes and projects.

