An artificial intelligence (AI) model pretrained on a vast dataset outperformed a standard baseline model at identifying non-melanoma skin cancer from digital images of tissue samples, according to a session presented at the 2025 American Association for Cancer Research (AACR) Annual Meeting.1 The findings were also published in Cancer Epidemiology, Biomarkers & Prevention.2 The study authors believe that such pretrained machine learning models could expand the scope of AI-based cancer diagnosis to resource-limited settings such as Bangladesh. The work was carried out in collaboration with the Institute for Population and Precision Health at the University of Chicago.
Stephen Song, MS, BS
“Our research proposes foundation models as resource-efficient tools to assist in the diagnosis of non-melanoma skin cancer, but acknowledges that it is far from directly impacting patient care and that further work is needed to address practical considerations such as digital pathology infrastructure, internet connectivity, and integration into clinical workflows,” said Stephen Song, MS, BS, an MD/PhD candidate at the University of Chicago Pritzker School of Medicine.
More on the Study and Key Findings
In this study, the researchers evaluated three modern pathology foundation models: PRISM (from Paige AI), UNI (from the Mahmood Lab at Brigham and Women's Hospital), and Prov-GigaPath (from Microsoft).
The accuracy of these models in diagnosing non-melanoma skin cancer was assessed on 2,130 hematoxylin and eosin–stained whole-slide images, representing 553 biopsy samples from 455 Bangladeshi individuals enrolled in the Bangladesh Vitamin E and Selenium Trial. High levels of exposure to arsenic through contaminated drinking water increase the risk of non-melanoma skin cancer in this population, providing real-world context for the study, Song said. Of the biopsy samples, 41% were benign, 31% showed Bowen's disease (also known as squamous cell carcinoma in situ), 21% showed basal cell carcinoma, and 7% showed invasive squamous cell carcinoma.
“We take whole-slide images, tile them into small image patches, use a tile encoder to encode those patches, aggregate the tiles into a slide-level representation, and perform final classification on the slide-level embedding,” Song explained. He added that, for slide aggregation, attention-based deep multiple-instance learning was used with the UNI and Prov-GigaPath models, and a multilayer perceptron was used for classification with all three models.
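The pipeline Song describes can be sketched roughly as follows. This is an illustrative numpy mock-up, not the study's actual code: the encoder, embedding dimensions, and all weights are random placeholders standing in for a pretrained foundation model and trained classifier heads.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_tiles(tiles, dim=64):
    # Placeholder tile encoder: a real pipeline would embed each patch
    # with a pretrained foundation model (e.g., UNI or Prov-GigaPath).
    w = rng.standard_normal((tiles.shape[1], dim))
    return np.tanh(tiles @ w)

def attention_mil_pool(embeddings, hidden=32):
    # Attention-based multiple-instance learning: score each tile,
    # softmax the scores into attention weights, and take a weighted
    # sum to form a single slide-level embedding.
    v = rng.standard_normal((embeddings.shape[1], hidden))
    w = rng.standard_normal(hidden)
    scores = np.tanh(embeddings @ v) @ w      # one score per tile
    attn = np.exp(scores - scores.max())
    attn /= attn.sum()                        # softmax over tiles
    return attn @ embeddings                  # slide-level embedding

def mlp_classify(slide_embedding, n_classes=4):
    # Multilayer perceptron head over the slide embedding; the four
    # classes mirror the study's categories (benign, Bowen's disease,
    # basal cell carcinoma, invasive squamous cell carcinoma).
    h = np.maximum(slide_embedding @ rng.standard_normal((slide_embedding.size, 16)), 0)
    logits = h @ rng.standard_normal((16, n_classes))
    e = np.exp(logits - logits.max())
    return e / e.sum()                        # class probabilities

tiles = rng.standard_normal((200, 128))       # 200 patches from one slide
probs = mlp_classify(attention_mil_pool(encode_tiles(tiles)))
print(probs.shape, float(probs.sum()))
```

With random weights the predicted class is meaningless; the point is the data flow from many tiles down to one probability vector per slide.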
“The AI-based foundation model is a powerful feature extractor that can easily be adapted to the task of accurately diagnosing non-melanoma skin cancer, representing a potential tool to assist pathologists working in resource-limited settings.”
– Stephen Song, MS, BS
The accuracy of the three foundation models was compared with that of ResNet18, an older image-recognition architecture. “The ResNet architecture has been used as a starting point for training vision models for nearly a decade and serves as a meaningful baseline for assessing the performance benefits of the newer foundation models,” Song said.
According to the study authors, all three foundation models significantly outperformed ResNet18. Even without additional training of the foundation models themselves, the derived classifiers correctly diagnosed subtypes of non-melanoma skin cancer, with areas under the receiver operating characteristic curve (AUROC) of 0.925 (PRISM), 0.913 (UNI), and 0.908 (Prov-GigaPath), all significantly exceeding ResNet18's 0.805 (P < .001).
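For readers unfamiliar with the metric: AUROC is the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case (1.0 is perfect ranking, 0.5 is chance). A minimal self-contained illustration on toy data (not the study's data) using the Mann-Whitney formulation:

```python
import numpy as np

def auroc(labels, scores):
    # AUROC = P(score of random positive > score of random negative),
    # with ties counted as one half (Mann-Whitney U statistic).
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy example: the model ranks most, but not all, positives above negatives.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
print(auroc(labels, scores))  # 8/9 ≈ 0.889
```

On this scale, the gap the authors report (0.925 vs 0.805) means PRISM ranks a random cancerous slide above a random benign one noticeably more often than ResNet18 does.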
“Overall, with each of these foundation models, basal cell carcinoma is relatively easy to classify,” Song said. “Bowen's disease is relatively difficult for all three models. Furthermore, UNI and Prov-GigaPath struggle to distinguish invasive squamous cell carcinoma. However, PRISM actually recovers some performance in distinguishing invasive squamous cell carcinoma from Bowen's disease.”
Song offered a possible explanation for PRISM's better performance on this task: “PRISM was trained on more skin slide images and appears to apply pan-slide attention, which is key to distinguishing invasive squamous cell carcinoma from in situ squamous cell carcinoma.”
Use in Resource-Limited Settings
Song and colleagues believe that these foundation models are “powerful feature extractors that can be easily adapted to the task of accurately diagnosing non-melanoma skin cancer…and represent potential tools to assist pathologists working in resource-limited settings.” The researchers developed and tested streamlined versions of each model to make them usable in resource-limited settings. These versions reduced the computation needed to analyze the pathology image data and still outperformed ResNet18, with AUROCs of 0.882 (PRISM), 0.865 (UNI), and 0.855 (Prov-GigaPath).
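The article does not detail how the streamlined models were built. One common way to cut the cost of a whole-slide pipeline, shown purely as an assumed illustration, is to encode only a subsample of each slide's tiles, since tile encoding dominates compute:

```python
import numpy as np

rng = np.random.default_rng(1)

def subsample_tiles(tiles, keep=0.25):
    # Hypothetical cost-reduction step (not confirmed as the study's
    # method): randomly keep a fraction of tiles before encoding, so
    # the expensive encoder runs on ~keep * N patches instead of N.
    n = max(1, int(len(tiles) * keep))
    idx = rng.choice(len(tiles), size=n, replace=False)
    return tiles[idx]

tiles = rng.standard_normal((2000, 128))  # tiles from one whole-slide image
subset = subsample_tiles(tiles)
print(len(subset))  # 500 tiles -> roughly 4x less encoding work
```

The trade-off mirrors the reported numbers: less computation per slide in exchange for a modest drop in AUROC.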
Additionally, the researchers developed and evaluated an automated slide annotation tool that does not require model training and may provide additional utility for triage in resource-limited settings. “In general, there was consensus between the model-derived annotations for these slides and manual annotations by expert pathologists,” Song said.
The investigators acknowledged some limitations of the research. First, the models were evaluated in a single cohort of Bangladeshi patients, which may limit the generalizability of the findings to other populations. Second, although the study approached model design with resource-limited settings in mind, it did not examine the practical details of deploying pretrained machine learning models in such settings.
Disclosure: This study was supported by the National Institutes of Health. Mr. Song reported no conflicts of interest.
References
1. Ellis S, et al: 2025 AACR Annual Meeting. Abstract 1141. Presented April 27, 2025.
2. Ellis S, et al: Cancer Epidemiol Biomarkers Prev. April 27, 2025 (early release online).
