How AI and machine learning are transforming



AI is revolutionizing drug discovery through its ability to analyze vast datasets. Particularly beneficial in the early drug discovery stage, AI helps identify potential drug targets more effectively than traditional methods, accelerating early drug development by rapidly sifting through large volumes of data.

According to a GlobalData survey published in April 2025, AI is considered the most disruptive technology facing businesses today, including in the healthcare industry, with 73% of respondents saying that AI will significantly or slightly disrupt the industry[i]. Indeed, the 2024 Nobel Prize in Chemistry, awarded for “computational protein design” and “protein structure prediction,” went in part to members of the DeepMind team for their work on AlphaFold[ii], an AI system that accurately predicts protein structure. Heiner Linke, chair of the Nobel Committee for Chemistry, noted that predicting protein structures from amino acid sequences opens up enormous possibilities.

Once a target is identified, AI continues to play an important role in lead generation and optimization. AI models can help predict molecular interactions and design new compounds for specific therapeutic goals. For example, generative AI is used to create small molecules or proteins that meet certain criteria, enhancing the drug design process[iii].

The role of AI has also expanded to clinical trials, where it is already being used to enhance trial design, feasibility assessment, and site selection, along with data analysis and regulatory submission and review. A 2023 study found that clinical trial patient-matching tools leveraging large language models (LLMs) could reduce physician checking time by 90%[iv].

This integration of AI into the drug discovery process could significantly reduce costs by improving efficiency. Some experts suggest that the cost of reaching Phase I readout, which can exceed $100 million, could drop to around $70 million[v]. AI could also condense the typical four- to five-year exploratory research stage into less than one year, making the pursuit of innovative treatments more cost-effective and significantly benefiting indications with few treatment options[vi].

However, the full potential of AI is constrained by challenges such as the need for high-quality datasets, regulatory hurdles, and a shortage of human expertise. Regulators are responding by establishing guidelines and encouraging collaboration with industry stakeholders, but challenges remain, particularly regarding the ethical implications of AI technology in healthcare. Widespread adoption is also hampered by the biological data required to train AI models, which is often time-consuming and resource-intensive to produce.

Early stage drug discovery

Machine learning (ML) models significantly improve the efficiency of early stage drug discovery, especially in hit identification and lead optimization, when compared with traditional approaches. One of the most notable benefits of ML is its ability to process huge amounts of data quickly. In traditional drug discovery, identifying a single candidate can take years and requires the synthesis and testing of thousands of molecules. A company prospectus reviewed by GlobalData[vii] shows that the traditional method requires the synthesis of approximately 5,000 molecules over four to six years to identify a promising candidate.

In contrast, ML models can effectively screen billions of molecules and reduce the number of physical tests to just a few hundred. This capability allows researchers to prioritize 1 billion molecules in about a day, compared to 1 million days in traditional ways. GlobalData's drug database shows that there are 40 regenerative medicine therapies currently being discovered or developed using AI[viii]. Of these, three have reached Phase II clinical trials, including Aspen Neuroscience's ANPD-001, which is currently in a Phase II human clinical trial for the treatment of Parkinson's disease. AI and ML play an important role in this process: they are used to test cells to ensure proper functioning, including ML-based genetic testing to assess cell quality.
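The prioritization step described above, in which a very large virtual library is narrowed to a few hundred candidates for physical testing, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: `score_molecule` is a hypothetical stand-in for a trained ML scoring model, molecule IDs are plain integers, and the library is far smaller than a real one.

```python
import heapq
import random


def score_molecule(mol_id):
    """Hypothetical stand-in for a trained ML model (higher = more promising)."""
    # A seeded generator gives each ID a fixed pseudo-random score.
    return random.Random(mol_id).random()


def prioritize(library, k):
    """Stream through the library and keep only the k best-scoring candidates."""
    # heapq.nlargest holds at most k items in memory, so the full library
    # never needs to be stored or sorted.
    return heapq.nlargest(k, library, key=score_molecule)


library = range(20_000)  # stand-in for a much larger virtual library
shortlist = prioritize(library, k=300)
print(len(shortlist), "candidates selected for physical testing")
```

Streaming with a bounded heap is one reasonable design choice when the library is too large to sort in memory; a production pipeline would likely batch the scoring on accelerators instead.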

ML models can also employ active learning techniques, which allow for more accurate prediction of molecular properties. By prioritizing which molecules are sent for physics-based analysis, researchers can catch potential problems early in the drug discovery process, further increasing efficiency and reducing costs.
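As a rough illustration of active learning, the sketch below repeatedly picks the candidate the model knows least about, sends it to an expensive evaluation, and adds the result to the training set. The names and choices are assumptions, not a method described in the sources: `expensive_oracle` is a hypothetical stand-in for a physics-based calculation, each molecule is reduced to a single numeric descriptor, the surrogate model is a simple nearest-neighbor lookup, and distance to the nearest labeled molecule serves as a crude proxy for model uncertainty.

```python
import random


def expensive_oracle(x):
    """Hypothetical stand-in for a costly physics-based calculation (lower = better)."""
    return (x - 0.3) ** 2


def nearest_labeled(x, labeled):
    """Return (distance, label) for the labeled point closest to x."""
    return min((abs(x - lx), ly) for lx, ly in labeled.items())


def active_learning(pool, n_initial=3, n_rounds=10, seed=0):
    """Pool-based active learning with a distance-based uncertainty proxy."""
    rng = random.Random(seed)
    labeled = {x: expensive_oracle(x) for x in rng.sample(pool, n_initial)}
    for _ in range(n_rounds):
        unlabeled = [x for x in pool if x not in labeled]
        if not unlabeled:
            break
        # Query the candidate farthest from anything already evaluated.
        query = max(unlabeled, key=lambda x: nearest_labeled(x, labeled)[0])
        labeled[query] = expensive_oracle(query)
    return labeled


def predict(x, labeled):
    """Surrogate prediction: the label of the nearest evaluated molecule."""
    return nearest_labeled(x, labeled)[1]


pool = [i / 200 for i in range(201)]  # 201 candidate descriptors in [0, 1]
labeled = active_learning(pool)
best = min(pool, key=lambda x: predict(x, labeled))
print(f"{len(labeled)} oracle calls; best predicted candidate near x = {best:.3f}")
```

Only 13 of the 201 candidates are evaluated with the expensive function, yet the surrogate's best guess lands close to the true optimum at 0.3. Real systems typically use model-based uncertainty estimates (for example, from ensembles) rather than this distance heuristic.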

Integrating generative AI and data analytics with data interrogation tools significantly improves the efficiency and effectiveness of drug discovery by supporting compound and target screening, prediction of biological activity, safety assessment, and identification of new uses for existing drugs.

AI-driven screening methods such as high-throughput screening (HTS) allow researchers to quickly evaluate large libraries of compounds. This greatly accelerates the identification of lead compounds by examining multiple candidates simultaneously, while tools for virtual screening, structure-based drug design, and ligand-based drug design support the identification of potential drug candidates by predicting how compounds interact with biological targets.
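One common building block of ligand-based virtual screening is ranking library compounds by their similarity to a known active. The sketch below uses the Tanimoto coefficient on fingerprints represented as sets of "on" bit positions. The fingerprints and compound names are invented for illustration; in practice fingerprints would be computed from molecular structures by a cheminformatics toolkit.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity: shared on-bits divided by total distinct on-bits."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def rank_by_similarity(query_fp, library):
    """Return compound names ordered from most to least similar to the query."""
    return sorted(
        library,
        key=lambda name: tanimoto(query_fp, library[name]),
        reverse=True,
    )


# Hypothetical fingerprints (sets of on-bit positions), for illustration only.
known_active = {1, 4, 7, 9, 12}
library = {
    "cmpd_A": {1, 4, 7, 9, 13},
    "cmpd_B": {2, 3, 5},
    "cmpd_C": {1, 4, 7, 9, 12, 15},
}

ranking = rank_by_similarity(known_active, library)
print(ranking)  # ['cmpd_C', 'cmpd_A', 'cmpd_B']
```

Here `cmpd_C` shares five of six distinct bits with the known active (similarity 5/6), `cmpd_A` shares four of six, and `cmpd_B` shares none.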

Tools that compile a wide range of data from a variety of databases allow researchers to assess the safety profile of compounds more effectively. By analyzing historical data on toxicity and pharmacokinetics, researchers can identify potential adverse effects and optimize drug candidates accordingly. By leveraging existing data on previously tested drug candidates, researchers can also explore new indications. This dual approach of “rescuing” historical candidates while advancing new ones minimizes risk and increases the likelihood of successful clinical trials.

AI model training

Training AI models for drug discovery relies heavily on a wide range of biological and clinical data, with an emphasis on diversity to mitigate bias. Addressing data gaps and bias requires a combination of innovative technologies, open data practices, and regulatory oversight. The most important data types for training AI models in drug discovery include biological and clinical trial data, digital biomarkers, and diverse demographic data.

Biological data is essential for understanding biological processes and drug interactions, but generating such data is slow and resource-intensive. Clinical trial data, including patient demographics, treatment response, and outcomes, is essential for predicting how patients will respond and for developing effective drugs. Digital biomarkers derived from digital behavioral interventions and patient monitoring systems provide insight into treatment response and enhance the predictive capabilities of AI models.

Addressing data gaps and biases in AI training is essential for drug development. Strategies include data augmentation, open source data sharing, fine-tuning techniques, human-in-the-loop approaches, and regulatory frameworks. Data augmentation increases the size of the training dataset by creating variations of existing data, while open source data sharing allows broader access to a wider variety of data sources. Fine-tuning techniques prioritize the use of representative training data to correct bias, and the human-in-the-loop approach involves expert feedback during the AI training process to identify and correct bias.
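Data augmentation, as described above, can be sketched in a few lines. This example assumes numeric feature vectors and creates variations by adding small Gaussian noise; the feature values, the noise level, and the `augment` helper are all hypothetical, and whether noise of this kind is appropriate depends on the data type.

```python
import random


def augment(samples, copies=3, noise_sd=0.05, seed=0):
    """Return the original samples plus noisy copies of each one."""
    rng = random.Random(seed)
    augmented = list(samples)
    for features, label in samples:
        for _ in range(copies):
            # Perturb each feature slightly; the label is kept unchanged.
            jittered = [x + rng.gauss(0.0, noise_sd) for x in features]
            augmented.append((jittered, label))
    return augmented


# Hypothetical training set: (feature vector, label) pairs.
train = [([0.12, 1.80], 1), ([0.40, 0.90], 0)]
bigger = augment(train)
print(len(train), "->", len(bigger), "samples")  # 2 -> 8 samples
```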

How Electronic Health Records (EHRs) Provide Key Insights

Veradigm, a leading provider of healthcare data and technology solutions, uses AI to analyze de-identified data in electronic health records (EHRs) within the Veradigm network, capturing structured data across diverse patient populations and regions. Veradigm's AI-driven approach also allows for scalable extraction of information from unstructured data, providing life science organizations with deeper, real-time insights into patient experiences and outcomes.

Veradigm recently developed an AI-enabled real-world database focused on GLP-1 therapies, designed to help life sciences companies understand and optimize treatment outcomes.* Through advanced AI-driven data curation, the platform extracts real-world insights from clinician notes, including reasons for discontinuation of GLP-1 therapy, such as adverse events and lack of efficacy. It also identifies off-label use patterns and associated comorbidities that may affect therapeutic decisions. Combined with clinical validation to ensure data accuracy and reliability, Veradigm's solutions provide the right evidence to accelerate research and improve patient care strategies.

For more information about Veradigm's approach, download the free paper below.


[i] GlobalData: Tech Sentiment Polls Q1 2025, April 2025. https://www.globaldata.com/store/report/tech-sentiment-sentiment-polls-quarterly-anasys/
[ii] https://www.nobelprize.org/prizes/chemistry/2024/press-release/
[iii] Zhao L, Wang J, Pang L, Liu Y, Zhang J. GANsDTA: Predicting Drug-Target Binding Affinity Using GANs. Front Genet. January 9, 2020; 10:1243. doi: 10.3389/fgene.2019.01243. PMID: 31993067; PMCID: PMC6962343. https://pmc.ncbi.nlm.nih.gov/articles/pmc6962343/
[iv] https://www.researchgate.net/publication/370071234_improving_patient_pre-screening_for_clinical_trials_assisting_physicians_with_large_language_models
[v] https://www.pharmaceutical-technology.com/features/the-ai-advantage-in discovering-new-medicines/
[vi] GlobalData: Healthcare Artificial Intelligence, September 2024.
[vii] QuantumPharm Inc.: Prospectus, 2024. https://ir.xtalpi.com/media/uopbz44q/2024060400059.pdf
[viii] GlobalData: Analyst briefing February 20, 2025. https://pharma.globaldata.com/analysis/details/using-artificial-intelligence-to-regenerative-medicine
* https://investor.veradigm.com/news-releases/news-release-details/veradigm-advances-glp-real-world-evidence-generation-ai-driven





