AI models flag hidden breast cancer risk on routine mammograms years before diagnosis



A large-scale NHS screening study shows that artificial intelligence can detect subtle signals in ‘normal’ mammograms and indicate which women are most likely to develop interval cancers, years before diagnosis.

Study: Performance of breast cancer risk prediction algorithms across mammography systems in the UK screening program. Image credit: CameraCraft / Shutterstock


In a recent study published in the journal npj Digital Medicine, researchers evaluated four state-of-the-art deep learning (DL) algorithms for predicting “interval cancers”. These cancers, which are diagnosed between a negative mammogram and the next scheduled screening test, account for approximately 30% of cancers in a screening program and represent a significant diagnostic gap in current mammogram-based screening approaches.

The study found that the academic DL model Mirai (developed by MIT) was the best-performing model (interval cancer AUC = 0.77). The model captured approximately 27.5% of interval cancers in the study cohort by flagging the top 4% of “normal” (negative) screening mammograms as highest risk.

Although the study notes that model performance varied slightly between the specific machines used to generate the mammogram images, and that one algorithm showed statistically significant differences between systems, the results suggest that DL tools could potentially support risk-stratified breast cancer screening strategies. Prospective clinical evaluation will be required before implementation.

Background: The challenge of interval breast cancer

For decades, breast cancer screening guidelines have recommended that women have a mammogram every few years (e.g., every three years in the UK). However, growing evidence suggests that while these routine screenings are necessary and effective in detecting most breast cancers, they fail to identify “interval cancers”, which are diagnosed after a negative screening mammogram but before the next scheduled screening.

These “hidden” cancers, which arise or become clinically apparent between scheduled screenings, are often significantly more aggressive than cancers detected by routine mammography, leading to poorer prognosis and clinical outcomes, including death.

Traditional approaches to addressing interval cancers have involved clinicians attempting to predict individual risk through genetic assessments (such as polygenic risk scores, which are not routinely performed in most population screening programs) and family history assessments (often incomplete).

However, recent advances in deep learning (DL) have allowed researchers to develop artificial intelligence (AI) models that are trained on millions of mammogram images and have the potential to recognize subtle image patterns and tissue characteristics that human radiologists may miss.

Unfortunately, despite the wealth of commercial and academic DL models currently available, clinicians do not yet know which model to choose or whether these tools perform well enough to be incorporated into personalized screening strategies.

Research objectives and model comparison

This study aimed to address this knowledge gap by directly comparing the performance of four of today’s most advanced DL models for breast cancer risk prediction: Mirai (MIT), iCAD ProFound AI Risk (a commercial model), Transpara Risk (another commercial DL tool), and the Google Health risk model.

Validation dataset from the UK NHS Screening Program

These models were validated on a dataset from England’s National Health Service (NHS). The dataset consisted of high-resolution “negative” (cancer-free) screening mammograms (n = 112,621) collected between 2014 and 2017 at two different NHS screening sites.

The performance of the model was tested by following participants for five years and observing which women eventually developed breast cancer (approximately 1,225 cases over the entire follow-up period), including interval cancers.

Evaluation across mammography hardware platforms

To assess how well the algorithms’ performance generalized across mammography hardware platforms, the DL models were evaluated on mammography images from different hardware ecosystems, specifically Philips and GE machines.

Predictive performance of deep learning models

The study results revealed that the academic algorithm Mirai consistently showed the highest predictive power (area under the curve [AUC] = 0.72; p < 0.001). iCAD (AUC = 0.70), Google (AUC = 0.68), and Transpara (AUC = 0.65) achieved lower scores, but their predictive performance is still notable considering that the input mammograms had been interpreted as “normal” during routine screening.
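For readers unfamiliar with the metric, AUC can be read as the probability that a model assigns a higher risk score to a randomly chosen woman who later develops cancer than to a randomly chosen woman who does not. The following is a minimal Python sketch of that definition, not the study's evaluation code. The function name `auc` and the toy scores and labels are invented for illustration and do not come from the study.

```python
def auc(scores, labels):
    """Pairwise AUC: share of (positive, negative) pairs in which the
    positive case has the higher score. Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (NOT from the study).
# label 1 = later developed cancer, label 0 = did not.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.65 to 0.72 should be read. The pairwise loop is quadratic in cohort size, so a rank-based implementation would be the practical choice for a dataset of this scale.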

Identifying high-risk patients with interval cancer

The results showed that these models can identify future interval cancers from screening mammograms that were initially interpreted as negative (Mirai interval cancer AUC = 0.77). When the researchers examined the top 4% of women identified by Mirai as “highest risk,” approximately 27.5% of all interval cancers in the cohort occurred within this high-risk group during follow-up.

The researchers observed that expanding this high-risk group to the top 14% of women nearly doubled the share of interval cancers captured, to approximately 50.3% of all future interval cancers in the cohort.
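The top 4% and top 14% figures above are capture rates: the share of all future cancers that falls inside the highest-scoring slice of the cohort. The sketch below shows one way such a figure could be computed, and how it translates into enrichment over random selection. It is an illustration under stated assumptions rather than the authors' method: the function names `capture_rate` and `enrichment` and the toy cohort are hypothetical, and only the percentages passed to `enrichment` are taken from the article.

```python
import math


def capture_rate(scores, labels, top_fraction):
    """Share of all positive cases found in the highest-scoring
    top_fraction of the cohort (e.g. 0.04 for the top 4%)."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive case")
    # Rank the cohort from highest to lowest risk score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_flagged = math.ceil(top_fraction * len(ranked))
    captured = sum(y for _, y in ranked[:n_flagged])
    return captured / total_pos


def enrichment(capture, top_fraction):
    """How many times more cancers the flagged group holds than a
    randomly chosen group of the same size would be expected to hold."""
    return capture / top_fraction


# Toy cohort of 10 women, invented for illustration (NOT study data).
toy_scores = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 0, 1, 0, 0, 0, 1, 0, 0, 1]
toy_capture = capture_rate(toy_scores, toy_labels, 0.5)

# Enrichment implied by the percentages reported in the article.
enrich_top4 = enrichment(0.275, 0.04)
enrich_top14 = enrichment(0.503, 0.14)
print(f"top 4%: {enrich_top4:.1f}x, top 14%: {enrich_top14:.1f}x")
```

On the reported figures, the top 4% group holds roughly 6.9 times as many interval cancers as a random 4% of the cohort would be expected to, while the top 14% group holds about 3.6 times as many. Widening the threshold captures more cancers in total but dilutes the concentration of risk in the flagged group.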

Performance across mammography equipment manufacturers

The study also evaluated whether algorithm performance differed between mammography machine manufacturers. The researchers found that three of the four evaluated models performed statistically similarly on images produced by Philips and GE machines. The Transpara model performed better on images generated on GE machines than on images generated on Philips machines, but the difference was relatively small (AUC = 0.69 vs. 0.62).

The researchers also highlighted several limitations, including the exclusion of mammograms with implants or nonstandard image views, incomplete ethnicity data, and the possibility that the results may not be fully generalizable to mammography systems from other major vendors. The authors also note that retrospective validation may underestimate potential clinical utility, as some cancers may be detected through additional imaging pathways beyond symptom onset alone.

Conclusion: Towards risk-stratified breast cancer screening

The current study provides evidence that DL models can identify previously unrecognized image signals in a standard mammogram and predict future cancer risk. Models such as MIT’s Mirai have been shown to flag a significant proportion of interval cancers within a small group of high-risk women.

Future research should aim to investigate these results in prospective clinical trials or real-world screening settings before integrating such tools into individualized screening protocols.


