Artificial intelligence has the potential to make cancer diagnosis safer and fairer by learning when to defer to human pathologists, without burdening them, according to researchers at the University of Surrey and Monash University.
This approach addresses two critical issues that limit the use of AI-assisted decision-making in cancer pathology, radiology, and other fields where human expertise remains essential. Current human-AI collaborative systems require every expert to review every case during training, an expensive and time-consuming process. They also tend to overwork the most accurate experts once deployed, putting them at risk of burnout and errors.
This research introduces a probabilistic method that allows an AI system to learn from imperfect expert input while distributing the workload evenly across teams.
The research team tested their approach on pathology images of colon cancer, with three expert pathologists classifying tissue samples into normal, precancerous, and cancerous categories. Even when 70% of expert annotations were missing during training, the system maintained high accuracy without overwhelming any single pathologist with cases.
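To make the "70% missing annotations" setup concrete, here is a minimal sketch of how such a training condition can be simulated. The array names, sizes, and the use of random class labels are illustrative assumptions, not the study's actual data pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_experts = 1000, 3   # three pathologists, as in the study
n_classes = 3                    # normal, precancerous, cancerous

# Hypothetical expert annotations: one class index per expert per sample.
annotations = rng.integers(0, n_classes, size=(n_samples, n_experts))

# Randomly hide 70% of annotations to mimic partial expert review.
missing_rate = 0.7
observed = rng.random((n_samples, n_experts)) >= missing_rate

# Masked annotation matrix: -1 marks a missing expert opinion.
masked = np.where(observed, annotations, -1)
print(f"fraction missing: {(masked == -1).mean():.2f}")  # ≈ 0.70
```

A training algorithm then sees only the entries not marked `-1`, which is the regime the researchers show their system can still learn from.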
Professor Gustavo Carneiro from the University of Surrey's Centre for Vision, Speech and Signal Processing, co-author of the study, said:
“In cancer pathology and radiology, we know that overloading professionals leads to errors. There are documented cases of radiologists misdiagnosing 162 cases in a day, when the average is only 50. AI prevents this by ensuring that work is distributed fairly while maintaining high accuracy. The AI learns to handle routine cases independently and delegate complex cases to humans, but importantly, not always to the same person.”
This challenge is particularly acute in cancer diagnosis, where expert judgment is required to distinguish between benign, precancerous, and malignant tissue, but pathologists are faced with an increasing number of cases. An AI system that can confidently handle simple cases while flagging complex cases for human review could reduce pressure on experts without compromising diagnostic accuracy.
Lead author and researcher Dr Cuong Nguyen from Surrey's Centre for Vision, Speech and Signal Processing said:
“Previous systems assumed that you could have all experts review all training samples, but for large datasets and busy clinical teams, this is simply not practical. We showed that it is possible to train an effective Human-AI system even when experts review only a portion of the data. This makes this technology much more practical for real-world deployment in cancer pathology and other high-risk medical fields.”
The system uses an algorithm that treats both the choice of which experts to consult and the opinions of the missing experts as latent variables to be inferred during training. It also includes mechanisms to cap the amount of work allocated to each expert and to the AI classifier itself, allowing organizations to set workload limits during training rather than adjusting them after the fact.
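The workload-capping idea can be illustrated with a simple greedy sketch: confident cases stay with the AI, uncertain cases are deferred to the least-loaded expert still under their cap. This is not the authors' probabilistic inference procedure; the threshold, cap, and confidence scores below are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

n_cases, n_experts = 200, 3
# Hypothetical AI confidence per case (e.g. max softmax probability).
ai_confidence = rng.random(n_cases)

confidence_threshold = 0.8   # defer cases the AI is unsure about
workload_cap = 30            # maximum cases any one expert handles

workloads = np.zeros(n_experts, dtype=int)
assignments = []             # "ai" or an expert index, per case

for conf in ai_confidence:
    under_cap = np.flatnonzero(workloads < workload_cap)
    if conf >= confidence_threshold or under_cap.size == 0:
        # Confident enough, or every expert is at capacity: AI decides.
        assignments.append("ai")
    else:
        # Defer to the least-loaded expert still under their cap.
        expert = under_cap[np.argmin(workloads[under_cap])]
        workloads[expert] += 1
        assignments.append(int(expert))

print("expert workloads:", workloads)
```

The key property is that no expert can exceed the cap, so difficult cases are spread across the team instead of piling onto the single most accurate reviewer.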
The study addresses growing concerns about the adoption of AI in healthcare, where purely automated systems may miss important details, while consulting humans on every decision is impractical and costly. The team also tested the approach on chest X-ray interpretation and bone disease imaging, demonstrating its versatility across a range of medical imaging tasks.
This research was presented at the 2025 International Conference on Learning Representations (ICLR).
