Suehnholz, S. P. et al. Quantifying the expanding landscape of clinical actionability for patients with cancer. Cancer Discov. 14, 49–65 (2023).
Horak, P. & Fröhling, S. Measuring progress in precision oncology. Cancer Discov. 14, 18–19 (2024).
Chakravarty, D. et al. OncoKB: a precision oncology knowledge base. JCO Precis. Oncol. 1, 1–16 (2017).
Griffith, M. et al. CIViC is a community knowledgebase for expert crowdsourcing the clinical interpretation of variants in cancer. Nat. Genet. 49, 170–174 (2017).
Reardon, B. et al. Integrating molecular profiles into clinical frameworks through the Molecular Oncology Almanac to prospectively guide precision oncology. Nat. Cancer 2, 1102–1112 (2021).
Luchini, C., Lawlor, R. T., Milella, M. & Scarpa, A. Molecular tumor boards in clinical practice. Trends Cancer 6, 738–744 (2020).
Gladstone, B. P. et al. Systematic review and meta-analysis of molecular tumor board data on clinical effectiveness and evaluation gaps. NPJ Precis. Oncol. 9, 96 (2025).
Nichetti, F. et al. Real-world outcomes of molecular tumor board treatment recommendations. JCO Precis. Oncol. 9, e2400387 (2025).
The AACR Project GENIE Consortium et al. AACR Project GENIE: powering precision medicine through an international consortium. Cancer Discov. 7, 818–831 (2017).
Pugh, T. J. et al. AACR project GENIE: 100,000 cases and beyond. Cancer Discov. 12, 2044–2057 (2022).
Wang, S. & Ye, K. Deep-learning based representation and recognition for genome variants — from SNVs to structural variants. Natl Sci. Rev. 11, nwae335 (2024).
Poplin, R. et al. A universal SNP and small-indel variant caller using deep neural networks. Nat. Biotechnol. 36, 983–987 (2018). This paper introduced DeepVariant and spurred the adoption of machine learning in bioinformatics by demonstrating that traditional heuristic and statistical approaches to variant calling could be outperformed.
Krusche, P. et al. Best practices for benchmarking germline small-variant calls in human genomes. Nat. Biotechnol. 37, 555–560 (2019).
AlDubayan, S. H. et al. Detection of pathogenic variants with germline genetic testing using deep learning vs standard methods in patients with prostate cancer and melanoma. JAMA 324, 1957–1969 (2020).
Zook, J. M. et al. An open resource for accurately benchmarking small variant and reference calls. Nat. Biotechnol. 37, 561–566 (2019).
Olson, N. D. et al. PrecisionFDA Truth Challenge V2: calling variants from short and long reads in difficult-to-map regions. Cell Genom. 2, 100129 (2022). This paper illustrates the methodological shift of variant callers towards using machine learning while also highlighting challenge areas for future developers.
Mandiracioglu, B. et al. ECOLE: learning to call copy number variants on whole exome sequencing data. Nat. Commun. 15, 132 (2024).
Popic, V. et al. Cue: a deep-learning framework for structural variant discovery and genotyping. Nat. Methods 20, 559–568 (2023).
Behera, S. et al. Comprehensive genome analysis and variant detection at scale using DRAGEN. Nat. Biotechnol. 43, 1177–1191 (2024).
Yi, R., Chang, P.-C., Baid, G. & Carroll, A. Learning from data-rich problems: a case study on genetic variant calling. Preprint at https://doi.org/10.48550/arXiv.1911.05151 (2019).
Scheffler, K. et al. Somatic small-variant calling methods in Illumina DRAGEN™ Secondary Analysis. Preprint at bioRxiv https://doi.org/10.1101/2023.03.23.534011 (2023).
Park, J. et al. Accurate somatic small variant discovery for multiple sequencing technologies with DeepSomatic. Nat. Biotechnol. https://doi.org/10.1038/s41587-025-02839-x (2025).
Betschart, R. O. et al. Comparison of calling pipelines for whole genome sequencing: an empirical study demonstrating the importance of mapping and alignment. Sci. Rep. 12, 21502 (2022).
Roy, S. et al. Standards and guidelines for validating next-generation sequencing bioinformatics pipelines. J. Mol. Diagn. 20, 4–27 (2018).
van de Haar, J. et al. ESMO recommendations on clinical reporting of genomic test results for solid cancers. Ann. Oncol. 35, 954–967 (2024).
Eilbeck, K. et al. The Sequence Ontology: a tool for the unification of genome annotations. Genome Biol. 6, R44 (2005).
den Dunnen, J. T. et al. HGVS recommendations for the description of sequence variants: 2016 update. Hum. Mutat. 37, 564–569 (2016).
Holmes, J. B., Moyer, E., Phan, L., Maglott, D. & Kattman, B. SPDI: data model for variants and applications at NCBI. Bioinformatics 36, 1902–1907 (2020).
Wang, M. et al. hgvs: a python package for manipulating sequence variants using HGVS nomenclature: 2018 update. Hum. Mutat. 39, 1803–1813 (2018).
Lefter, M. et al. Mutalyzer 2: next generation HGVS nomenclature checker. Bioinformatics 37, 2811–2817 (2021).
van Giffen, B., Herhausen, D. & Fahse, T. Overcoming the pitfalls and perils of algorithms: a classification of machine learning biases and mitigation methods. J. Bus. Res. 144, 93–106 (2022).
Singh, D. & Singh, B. Investigating the impact of data normalization on classification performance. Appl. Soft Comput. 97, 105524 (2020).
Sayers, E. W. et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 50, D20–D26 (2022).
Freeman, P. J., Hart, R. K., Gretton, L. J., Brookes, A. J. & Dalgleish, R. VariantValidator: accurate validation, mapping, and formatting of sequence variation descriptions. Hum. Mutat. 39, 61–68 (2018).
Freeman, P. J. et al. Standardizing variant naming in literature with VariantValidator to increase diagnostic rates. Nat. Genet. 56, 2284–2286 (2024).
McLaren, W. et al. The Ensembl Variant Effect Predictor. Genome Biol. 17, 122 (2016).
Wagner, A. H. et al. The GA4GH Variation Representation Specification: a computational framework for variation representation and federated identification. Cell Genom. 1, 100027 (2021). This paper shows that VRS enables semantically precise, computable variant representation that facilitates further downstream bioinformatic applications and machine learning models.
Chen, S. et al. A genomic mutational constraint map using variation in 76,156 human genomes. Nature 625, 92–100 (2024).
Arbesfeld, J. A. et al. Mapping MAVE data for use in human genomics applications. Genome Biol. 26, 179 (2025).
Pagel, K. A. et al. Integrated informatics analysis of cancer-related variants. JCO Clin. Cancer Inform. 4, 310–317 (2020).
de Bruijn, I. et al. Genome Nexus: a comprehensive resource for the annotation and interpretation of genomic variants in cancer. JCO Clin. Cancer Inform. 6, e2100144 (2022).
Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17, 405–424 (2015).
Durkie, M. et al. ACGS Best Practice Guidelines for Variant Classification in Rare Disease (ACGS, 2024).
Horak, P. et al. Standards for the classification of pathogenicity of somatic variants in cancer (oncogenicity): joint recommendations of Clinical Genome Resource (ClinGen), Cancer Genomics Consortium (CGC), and Variant Interpretation for Cancer Consortium (VICC). Genet. Med. 24, 986–998 (2022).
Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176, 535–548.e24 (2019).
Brandes, N., Goldman, G., Wang, C. H., Ye, C. J. & Ntranos, V. Genome-wide prediction of disease variant effects with a deep protein language model. Nat. Genet. 55, 1512–1522 (2023).
Cheng, J. et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science 381, eadg7492 (2023). This paper introduces DeepMind’s AlphaMissense, a transformative deep learning model for missense variant effect prediction, and rigorously evaluates its utility within pathogenicity assessments.
Kurtovic-Kozaric, A. et al. Comprehensive evaluation of AlphaMissense predictions by evidence quantification for variants of uncertain significance. Front. Genet. 15, 1487608 (2024).
Muiños, F., Martínez-Jiménez, F., Pich, O., Gonzalez-Perez, A. & Lopez-Bigas, N. In silico saturation mutagenesis of cancer genes. Nature 596, 428–432 (2021).
Demajo, S. et al. Identification of clonal hematopoiesis driver mutations through in silico saturation mutagenesis. Cancer Discov. 14, 1717–1731 (2024).
Vihinen, M. Problems in variation interpretation guidelines and in their implementation in computational tools. Mol. Genet. Genom. Med. 8, e1206 (2020).
Fayer, S. et al. Closing the gap: systematic integration of multiplexed functional data resolves variants of uncertain significance in BRCA1, TP53, and PTEN. Am. J. Hum. Genet. 108, 2248–2258 (2021).
Rubin, A. F. et al. MaveDB 2024: a curated community database with over seven million variant effects from multiplexed functional assays. Genome Biol. 26, 13 (2025).
Arafeh, R., Shibue, T., Dempster, J. M., Hahn, W. C. & Vazquez, F. The present and future of the cancer dependency map. Nat. Rev. Cancer 25, 59–73 (2025).
Brixi, G. et al. Genome modeling and design across all domains of life with Evo 2. Preprint at bioRxiv https://doi.org/10.1101/2025.02.18.638918 (2025).
Avsec, Ž. et al. AlphaGenome: advancing regulatory variant effect prediction with a unified DNA sequence model. Preprint at bioRxiv https://doi.org/10.1101/2025.06.25.661532 (2025).
Li, M. M. et al. Standards and guidelines for the interpretation and reporting of sequence variants in cancer. J. Mol. Diagn. 19, 4–23 (2017).
Mateo, J. et al. A framework to rank genomic alterations as targets for cancer precision medicine: the ESMO Scale for Clinical Actionability of Molecular Targets (ESCAT). Ann. Oncol. 29, 1895–1902 (2018).
He, M. M. et al. Variant Interpretation for Cancer (VIC): a computational tool for assessing clinical impacts of somatic variants. Genome Med. 11, 53 (2019).
Li, Q. et al. CancerVar: an artificial intelligence-empowered platform for clinical interpretation of somatic mutations in cancer. Sci. Adv. 8, eabj1624 (2022).
Ruzicka, J. et al. Clinical evaluation of an AI system for streamlined variant interpretation in genetic testing. Preprint at medRxiv https://doi.org/10.1101/2025.02.04.25321641 (2025).
Lammert, J. et al. Large language models for precision oncology: clinical decision support through expert-guided learning. J. Clin. Oncol. 42, e13609 (2024).
Klein, H. et al. MatchMiner: an open-source platform for cancer precision medicine. NPJ Precis. Oncol. 6, 69 (2022). The authors introduce a clinical trial matching platform and a structured format for enrolment criteria to facilitate clinical trial matching for precision oncology, addressing a historically intractable problem within the field.
Lotter, W. et al. Artificial intelligence in oncology: current landscape, challenges, and future directions. Cancer Discov. 14, 711–726 (2024).
Wong, C. et al. Scaling clinical trial matching using large language models: a case study in oncology. In Proc. 8th Machine Learning for Healthcare Conference 846–862 (PMLR, 2023).
Jin, Q. et al. Matching patients to clinical trials with large language models. Nat. Commun. 15, 9074 (2024).
Cerami, E. et al. MatchMiner-AI: an open-source solution for cancer clinical trial matching. Preprint at https://doi.org/10.48550/arXiv.2412.17228 (2024).
Reisle, C. et al. Evaluating language models for biomedical fact-checking: a benchmark dataset for cancer variant interpretation verification. Preprint at bioRxiv https://doi.org/10.1101/2025.09.10.675443 (2025).
Lewis, P. et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. In Advances in Neural Information Processing Systems 33, 9459–9474 (Curran Associates, 2020).
Jun, H. et al. Implementing a context-augmented large language model to guide precision cancer medicine. Preprint at medRxiv https://doi.org/10.1101/2025.05.09.25327312 (2025).
Schick, T. et al. Toolformer: language models can teach themselves to use tools. In Advances in Neural Information Processing Systems 36, 68539–68551 (Curran Associates, 2023).
Yao, S. et al. ReAct: synergizing reasoning and acting in language models. Preprint at https://doi.org/10.48550/arXiv.2210.03629 (2023).
Gao, S. et al. TxAgent: an AI agent for therapeutic reasoning across a universe of tools. Preprint at https://doi.org/10.48550/arXiv.2503.10970 (2025).
Ferber, D. et al. Development and validation of an autonomous artificial intelligence agent for clinical decision-making in oncology. Nat. Cancer 6, 1337–1349 (2025). This study is one of the most prominent illustrations of agentic AI systems being applied to precision oncology to support a wide array of clinical decision-making tasks.
Benary, M. et al. Leveraging large language models for decision support in personalized oncology. JAMA Netw. Open 6, e2343689 (2023).
Verlingue, L. et al. Artificial intelligence in oncology: ensuring safe and effective integration of language models in clinical practice. Lancet Reg. Health Eur. 46, 101064 (2024).
Elemento, O., Khozin, S. & Sternberg, C. N. The use of artificial intelligence for cancer therapeutic decision-making. NEJM AI 2, AIra2401164 (2025).
Deng, J. et al. ImageNet: a large-scale hierarchical image database. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 248–255 (IEEE, 2009).
Yang, K., Qinami, K., Fei-Fei, L., Deng, J. & Russakovsky, O. Towards fairer datasets: filtering and balancing the distribution of the people subtree in the ImageNet hierarchy. In Proc. 2020 Conference on Fairness, Accountability, and Transparency 547–558 (ACM, 2020).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
Acebedo, A. et al. Collaborating across sectors in service of open science, precision oncology, and patients: an overview of the AACR Project GENIE (Genomics Evidence Neoplasia Information Exchange) Biopharma Collaborative (BPC). ESMO Real World Data Digit. Oncol. 7, 100097 (2025).
Painter, C. A. et al. The Angiosarcoma Project: enabling genomic and clinical discoveries in a rare cancer through patient-partnered research. Nat. Med. 26, 181–187 (2020).
Crowdis, J. et al. A patient-driven clinicogenomic partnership for metastatic prostate cancer. Cell Genom. 2, 100169 (2022).
Lee, E., Jung, S. Y., Hwang, H. J. & Jung, J. Patient-level cancer prediction models from a nationwide patient cohort: model development and validation. JMIR Med. Inform. 9, e29807 (2021).
Placido, D. et al. A deep learning algorithm to predict risk of pancreatic cancer from disease trajectories. Nat. Med. 29, 1113–1122 (2023).
Buk Cardoso, L. et al. Machine learning for predicting survival of colorectal cancer patients. Sci. Rep. 13, 8874 (2023).
Moon, I. et al. Machine learning for genetics-based classification and treatment response prediction in cancer of unknown primary. Nat. Med. 29, 2057–2067 (2023).
Jee, J. et al. Automated real-world data integration improves cancer outcome prediction. Nature 636, 728–736 (2024). This paper shows how MSKCC leveraged its data warehouse to develop a machine learning model that predicts clinical outcomes, a paradigm that will continue to define clinicogenomic discoveries in the near term.
Rieke, N. et al. The future of digital health with federated learning. NPJ Digit. Med. 3, 1–7 (2020).
Sheller, M. J. et al. Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data. Sci. Rep. 10, 12598 (2020).
Pati, S. et al. Federated learning enables big data for rare cancer boundary detection. Nat. Commun. 13, 7346 (2022).
Brauneck, A. et al. Federated machine learning in data-protection-compliant research. Nat. Mach. Intell. 5, 2–4 (2023).
Ogier du Terrail, J. et al. Federated learning for predicting histological response to neoadjuvant chemotherapy in triple-negative breast cancer. Nat. Med. 29, 135–146 (2023).
Stark, Z. et al. A call to action to scale up research and clinical genomic data sharing. Nat. Rev. Genet. 26, 141–147 (2024). This study outlines several steps towards data sharing and harmonization that can enable clinicogenomic datasets of thousands of patients with cancer, supporting biological discovery and machine learning models that generalize across institutions.
Fiume, M. et al. Federated discovery and sharing of genomic data using Beacons. Nat. Biotechnol. 37, 220–224 (2019). This study describes the Beacon protocol of GA4GH for federated data sharing, which has become synonymous with federated approaches within genomics.
Elhussein, A., Baymuradov, U., Elhadad, N., Natarajan, K. & Gürsoy, G. A framework for sharing of clinical and genetic data for precision medicine applications. Nat. Med. 30, 3578–3589 (2024).
Cho, H. et al. Secure and federated genome-wide association studies for biobank-scale datasets. Nat. Genet. 57, 809–814 (2025).
Hanser, T. et al. Data-driven federated learning in drug discovery with knowledge distillation. Nat. Mach. Intell. 7, 423–436 (2025).
Riba, M. et al. The 1+Million Genomes Minimal Dataset for Cancer. Nat. Genet. 56, 733–736 (2024).
Kehl, K. L. et al. Assessment of deep natural language processing in ascertaining oncologic outcomes from radiology reports. JAMA Oncol. 5, 1421–1429 (2019).
Kehl, K. L. et al. Natural language processing to ascertain cancer outcomes from medical oncologist notes. JCO Clin. Cancer Inform. 4, 680–690 (2020).
Sushil, M. et al. CORAL: expert-curated oncology reports to advance language model inference. NEJM AI 1, AIdbp2300110 (2024).
Hoes, L. R. et al. Patients with rare cancers in the Drug Rediscovery Protocol (DRUP) benefit from genomics-guided treatment. Clin. Cancer Res. 28, 1402–1411 (2022).
Helland, Å. et al. Improving public cancer care by implementing precision medicine in Norway: IMPRESS-Norway. J. Transl. Med. 20, 225 (2022).
Mohammad, S. F. H. et al. The evolution of precision oncology: the ongoing impact of the Drug Rediscovery Protocol (DRUP). Acta Oncol. 63, 34885 (2024).
Nikolski, M. et al. Roadmap for a European cancer data management and precision medicine infrastructure. Nat. Cancer 5, 367–372 (2024).
Sweeney, S. M. et al. Challenges to using big data in cancer. Cancer Res. 83, 1175–1182 (2023).
Seligson, N. D. et al. Recommendations for patient similarity classes: results of the AMIA 2019 Workshop on Defining Patient Similarity. J. Am. Med. Inform. Assoc. 27, 1808–1812 (2020). This study provides a conceptual roadmap for the development and implementation of patient similarity approaches within medicine broadly.
Allam, A., Dittberner, M., Sintsova, A., Brodbeck, D. & Krauthammer, M. Patient similarity analysis with longitudinal health data. Preprint at https://doi.org/10.48550/arXiv.2005.06630 (2020).
Jia, Z., Zeng, X., Duan, H., Lu, X. & Li, H. A patient-similarity-based model for diagnostic prediction. Int. J. Med. Inf. 135, 104073 (2020).
Navaz, A. N. et al. A novel patient similarity network (PSN) framework based on multi-model deep learning for precision medicine. J. Pers. Med. 12, 768 (2022).
Wang, N. et al. Sequential data-based patient similarity framework for patient outcome prediction: algorithm development. J. Med. Internet Res. 24, e30720 (2022).
Savcisens, G. et al. Using sequences of life-events to predict human lives. Nat. Comput. Sci. 4, 43–56 (2023). This study illustrates the power of sequence models to capture temporal relationships while maintaining interpretability.
Manuilova, I. et al. Identifications of similarity metrics for patients with cancer: protocol for a scoping review. JMIR Res. Protoc. 13, e58705 (2024).
Elmarakeby, H. A. et al. Biologically informed deep neural network for prostate cancer discovery. Nature 598, 348–352 (2021).
Osipov, A. et al. The molecular twin artificial-intelligence platform integrates multi-omic data to predict outcomes for pancreatic adenocarcinoma patients. Nat. Cancer 5, 299–314 (2024).
Najgebauer, H. et al. CELLector: genomics-guided selection of cancer in vitro models. Cell Syst. 10, 424–432.e6 (2020).
Sinha, R., Luna, A., Schultz, N. & Sander, C. A pan-cancer survey of cell line tumor similarity by feature-weighted molecular profiles. Cell Rep. Methods 1, 100039 (2021).
Zhao, Y. et al. CUP-AI-Dx: a tool for inferring cancer tissue of origin and molecular subtype using RNA gene-expression data and artificial intelligence. EBioMedicine 61, 103030 (2020).
Vibert, J. et al. Identification of tissue of origin and guided therapeutic applications in cancers of unknown primary using deep learning and RNA sequencing (TransCUPtomics). J. Mol. Diagn. 23, 1380–1392 (2021).
Darmofal, M. et al. Deep-learning model for tumor-type prediction using targeted clinical genomic sequencing data. Cancer Discov. 14, 1064–1081 (2024).
Bick, A. G. et al. Genomic data in the All of Us Research Program. Nature 627, 340–346 (2024).
Subhashini, R. & Kumar, V. J. S. Evaluating the performance of similarity measures used in document clustering and information retrieval. In Proc. First International Conference on Integrated Intelligent Computing 27–31 (IEEE, 2010).
Parimbelli, E., Marini, S., Sacchi, L. & Bellazzi, R. Patient similarity for precision medicine: a systematic review. J. Biomed. Inform. 83, 87–96 (2018).
Cross, J. L., Choma, M. A. & Onofrey, J. A. Bias in medical AI: implications for clinical decision-making. PLoS Digit. Health 3, e0000651 (2024). This study outlines several biases that must be considered, especially by model developers, for successful AI applications within medicine broadly.
Collins, G. S. et al. Evaluation of clinical prediction models (part 1): from development to external validation. BMJ 384, e074819 (2024).
Hantel, A. et al. Perspectives of oncologists on the ethical implications of using artificial intelligence for cancer care. JAMA Netw. Open 7, e244077 (2024).
Dai, L., Zhu, H. & Liu, D. Patient similarity: methods and applications. Preprint at https://doi.org/10.48550/arXiv.2012.01976 (2020).
Aldrighetti, C. M., Niemierko, A., Van Allen, E., Willers, H. & Kamran, S. C. Racial and ethnic disparities among participants in precision oncology clinical studies. JAMA Netw. Open 4, e2133205 (2021).
Kamran, S. C. et al. Tumor mutations across racial groups in a real-world data registry. JCO Precis. Oncol. 5, 1654–1658 (2021).
Cheung, A. T. M. et al. Racial and ethnic disparities in a real-world precision oncology data registry. NPJ Precis. Oncol. 7, 1–6 (2023).
Lundberg, S. M. et al. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2, 56–67 (2020).
Kehl, K. L. et al. Shareable artificial intelligence to extract cancer outcomes from electronic health records for precision oncology research. Nat. Commun. 15, 1–11 (2024).
Ehrmann, D. E., Joshi, S., Goodfellow, S. D., Mazwi, M. L. & Eytan, D. Making machine learning matter to clinicians: model actionability in medical decision-making. NPJ Digit. Med. 6, 1–5 (2023).
Vaccaro, M., Almaatouq, A. & Malone, T. When combinations of humans and AI are useful: a systematic review and meta-analysis. Nat. Hum. Behav. 8, 2293–2303 (2024).
Riley, R. D. et al. Evaluation of clinical prediction models (part 2): how to undertake an external validation study. BMJ 384, e074820 (2024).
Riley, R. D. et al. Evaluation of clinical prediction models (part 3): calculating the sample size required for an external validation study. BMJ 384, e074821 (2024).
la Roi-Teeuw, H. M. et al. Don’t be misled: 3 misconceptions about external validation of clinical prediction models. J. Clin. Epidemiol. 172, 111387 (2024).
Petersen, C. et al. Recommendations for the safe, effective use of adaptive CDS in the US healthcare system: an AMIA position paper. J. Am. Med. Inform. Assoc. 28, 677–684 (2021).
Ong, J. C. L. et al. Medical ethics of large language models in medicine. NEJM AI 1, AIra2400038 (2024).
Ghassemi, M., Oakden-Rayner, L. & Beam, A. L. The false hope of current approaches to explainable artificial intelligence in health care. Lancet Digit. Health 3, e745–e750 (2021). This critical review encourages model developers to focus on model validation instead of interpretability.
Gilbert, S. & Kather, J. N. Guardrails for the use of generalist AI in cancer care. Nat. Rev. Cancer 24, 357–358 (2024).
Zhou, L. et al. Larger and more instructable language models become less reliable. Nature 634, 61–68 (2024).
Singhal, K. et al. Large language models encode clinical knowledge. Nature 620, 172–180 (2023).
Lipkova, J. & Kather, J. N. The age of foundation models. Nat. Rev. Clin. Oncol. 21, 769–770 (2024).
Okun, S. A., Lu, D., Sew, K., Subramaniam, A. & Lockwood, W. W. MET activation in lung cancer and response to targeted therapies. Cancers 17, 281 (2025).
Rodon, J. et al. Genomic and transcriptomic profiling expands precision cancer medicine: the WINTHER trial. Nat. Med. 25, 751–758 (2019).
Vaske, O. M. et al. Comparative tumor RNA sequencing analysis for difficult-to-treat pediatric and young adult patients with cancer. JAMA Netw. Open 2, e1913968 (2019).
Wong, M. et al. Whole genome, transcriptome and methylome profiling enhances actionable target discovery in high-risk pediatric cancer. Nat. Med. 26, 1742–1753 (2020).
Yates, J. & Van Allen, E. M. New horizons at the interface of artificial intelligence and translational cancer research. Cancer Cell 43, 708–727 (2025).
Rehm, H. L. et al. GA4GH: international policies and standards for data sharing across genomic research and healthcare. Cell Genom. 1, 100029 (2021).
Shick, A. A. et al. Transparency of artificial intelligence/machine learning-enabled medical devices. NPJ Digit. Med. 7, 1–4 (2024).
Bonneville, R. et al. Landscape of microsatellite instability across 39 cancer types. JCO Precis. Oncol. 1, 1–15 (2017).
Nguyen, L. et al. Pan-cancer landscape of homologous recombination deficiency. Nat. Commun. 11, 5584 (2020).
Jia, P. et al. MSIsensor-pro: fast, accurate, and matched-normal-sample-free detection of microsatellite instability. Genom. Proteom. Bioinform. 18, 65–71 (2020).
Niu, B. et al. MSIsensor: microsatellite instability detection using paired tumor-normal sequence data. Bioinformatics 30, 1015–1016 (2014).
Ziegler, J. et al. A deep multiple instance learning framework improves microsatellite instability detection from tumor next generation sequencing. Nat. Commun. 16, 136 (2025). This paper presents a deep learning model that increases performance of MSI detection relative to status quo bioinformatic tools while also enabling tissue conservation.
Sztupinszki, Z. et al. Migrating the SNP array-based homologous recombination deficiency measures to next generation sequencing data of breast cancer. NPJ Breast Cancer 4, 1–4 (2018).
Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013).
Rosenthal, R., McGranahan, N., Herrero, J., Taylor, B. S. & Swanton, C. deconstructSigs: delineating mutational processes in single tumors distinguishes DNA repair deficiencies and patterns of carcinoma evolution. Genome Biol. 17, 31 (2016).
Díaz-Gay, M. et al. Assigning mutational signatures to individual samples and individual somatic mutations with SigProfilerAssignment. Bioinformatics 39, btad756 (2023).
Gulhan, D. C., Lee, J. J.-K., Melloni, G. E. M., Cortés-Ciriano, I. & Park, P. J. Detecting the mutational signature of homologous recombination deficiency in clinical samples. Nat. Genet. 51, 912–919 (2019).
Laprovitera, N. et al. Cancer of unknown primary: challenges and progress in clinical management. Cancers 13, 451 (2021).
Belenkaya, R. et al. Extending the OMOP common data model and standardized vocabularies to support observational cancer research. JCO Clin. Cancer Inform. 5, 12–20 (2021).
