How machine learning can change the pharmaceutical industry

The use of machine learning (ML) models has gained particular attention in industries where high failure rates are a major problem. Only one in ten drug candidates that enter the (already exclusive) human clinical trial stage is brought to market. As the need for new treatments grows, drug discovery and development is a promising area for ML. We spoke to Professor Miraz Rahman about whether, and when, we can expect ML to change the way medicines are discovered and delivered to the public.

Professor Rahman has extensive experience in applying advanced computational methods to understand drug resistance. His research on DNA binding agents has yielded significant results, including patented anti-cancer molecules, contributions to a total of 24 filed patents in drug discovery, and the creation of two clinical-stage university spin-out companies.

He provided a grounded perspective on the hype, hurdles, and exciting advances of machine learning in the pharmaceutical industry.

The big picture: The pharmaceutical market is highly complex

Since the 1950s, drug discovery has become progressively slower and more expensive. Despite rapid advances in technology, the number of approved drugs per billion USD halves every nine years.

However, machine learning could be transformative: it “ultimately has the potential to intervene at every stage of drug discovery,” Professor Rahman explained.

One of the major costs in drug development is clinical trials, which test a drug’s safety and effectiveness on a gradually expanding patient population. Although ML models are typically considered useful in the early stages of drug discovery, ML can also play a transformative role in clinical development by integrating preclinical data with patient population characteristics to optimize trial design, stratification, and endpoint selection. According to Professor Rahman, there is great untapped potential in streamlining clinical trials. “Even if you shave one year off the length of a clinical trial, that means savings in the tens to hundreds of millions.”

He highlighted the role of machine learning in clinical formulation, the process of developing the final treatment administered to patients. It is important not only to identify promising drug candidates, but also to determine their ideal dose, dosage form, and physiological effects on the body. As Professor Rahman explained, machine learning can integrate clinical and biopharmaceutical data to refine dose selection and predict patient response more efficiently than traditional empirical approaches. Because this stage sits close to regulatory approval and market entry, optimization using machine learning has significant commercial value, reducing the risk of late-stage failure, accelerating timelines, and increasing the likelihood of clinical and commercial success.

However, he says that ML in this context still has a very long way to go, and that is not necessarily reflected in the media attention. “Baby steps make a lot of noise, but they’re still baby steps.”

Being wrong is undervalued

When it comes to machine learning, the quality of the output mirrors the quality of the input. In drug development, this is reflected in the importance of negative data: cases that do not exhibit the property the model is trying to identify.

Machine learning models can learn as much from failed experiments as from successes. For example, a model trained on a less biased dataset can often reject false positives more reliably. In drug discovery, this is particularly advantageous for classification tasks such as ranking drug candidates by target affinity. In general, the better the balance between positive and negative data, the better the model will perform when analyzing or predicting real-world situations.
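As a toy illustration of why negative data matters, consider a hypothetical screening set in which almost all reported candidates are actives (the counts and the degenerate “always active” model below are invented purely for illustration):

```python
# Toy illustration (hypothetical data): why negative examples matter.
# Suppose a screening set of 100 candidates contains 95 reported actives
# and only 5 reported inactives -- a heavily imbalanced dataset.
labels = [1] * 95 + [0] * 5  # 1 = active, 0 = inactive

# A degenerate model that has effectively only seen positives predicts
# "active" for everything.
predictions = [1] * len(labels)

# Plain accuracy looks excellent...
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# ...but balanced accuracy (the mean of per-class recall) exposes the
# failure: the model never identifies a single inactive compound.
recall_pos = sum(p == 1 for p, y in zip(predictions, labels) if y == 1) / 95
recall_neg = sum(p == 0 for p, y in zip(predictions, labels) if y == 0) / 5
balanced_accuracy = (recall_pos + recall_neg) / 2

print(accuracy)           # 0.95
print(balanced_accuracy)  # 0.5
```

A model evaluated only against positive-heavy data can look near-perfect while being useless at the task that matters: separating actives from inactives.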

Professor Rahman agreed that the overwhelming publication bias toward positive data is a major difficulty in training ML models. He stressed that the main reason for this is the reluctance of academics and journals to publish negative findings: “Good headlines beget better headlines.”


The power of integration

Even setting this bias aside, the sheer volume of available data can be difficult to process with ML models, even on dedicated supercomputers. Professor Rahman therefore believes that integrating ML into existing processes is the most attractive approach.

Connecting ML predictions directly into existing pipelines makes applications more specific, decisive, and actionable. Professor Rahman explained that models need to be gradually incorporated into drug development pipelines, from target identification to clinical trial optimization. More complex models can then capture specific disease responses to drugs, such as altered responses to antibiotics, and multicomponent effects that arise in vivo, such as competition with nutrients and commensal bacteria in the gut.
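A minimal sketch of what “integration” might look like in practice: an ML score inserted as one stage of an existing candidate-triage pipeline, leaving the established rule-based filters in place. Every function, field, and threshold here is a hypothetical placeholder, not a real cheminformatics API or Professor Rahman’s actual workflow:

```python
# Hypothetical sketch: slotting an ML scorer into an existing pipeline.

def passes_rule_filters(candidate: dict) -> bool:
    # Stand-in for existing rule-based triage (e.g. a property cutoff).
    return candidate["mol_weight"] < 500

def ml_affinity_score(candidate: dict) -> float:
    # Stand-in for a trained model's predicted target affinity.
    return candidate["score"]

def rank_candidates(candidates: list, top_k: int = 3) -> list:
    # Stage 1: keep the existing rule-based checks untouched.
    survivors = [c for c in candidates if passes_rule_filters(c)]
    # Stage 2: add the ML prediction as one more ranking signal.
    survivors.sort(key=ml_affinity_score, reverse=True)
    return survivors[:top_k]

candidates = [
    {"name": "cmpd-A", "mol_weight": 420, "score": 0.91},
    {"name": "cmpd-B", "mol_weight": 610, "score": 0.97},  # fails the rule filter
    {"name": "cmpd-C", "mol_weight": 350, "score": 0.64},
]
print([c["name"] for c in rank_candidates(candidates)])  # ['cmpd-A', 'cmpd-C']
```

The design point is that the model augments, rather than replaces, the checks a team already trusts, which makes gradual adoption far easier.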

However, Professor Rahman emphasized that “even when integrating models, we need to respect [the] uniqueness [of] each dataset.” Not all biomedical datasets are compatible: each has its own level of noise, bias, size, and biological significance. Those training models must account for such differing metadata to avoid flattening important distinctions and to ensure that predictions at each stage are biologically accurate.

When and how will ML change the industry?

Professor Rahman explained that machine learning is still in its very early stages, and wet lab work remains essential as the engine that provides representative, high-quality data and validates computational insights. The more sophisticated a model, the more it must rely on rigorous experimental design and careful data generation to stay grounded in reality. Machines will therefore not eliminate the need for laboratories any time soon; rather, scientists will increasingly need skill sets focused on understanding and critically using these models.

Professor Rahman emphasized that in this situation, we should not fear this change, but rather embrace it and treat it with the respect it deserves. He explained that it is essential to teach the rational use of these technologies in research and their role in the creation of new knowledge.

Despite rapid progress, he stressed that unlocking the true potential of machine learning in drug discovery requires collaboration beyond the private sector. Training better models requires large, high-quality datasets, shared infrastructure, and long-term investments, resources that are difficult for individual companies to build on their own.

He asserted that government-led initiatives play an important role in funding large-scale collaborative research programs, and in particular in facilitating responsible and balanced data sharing across industry and academia. “We need a better model, but I don’t see how the private sector alone can deliver it,” he explained.


