Leverage new machine learning approaches to strengthen diversity and inclusion efforts in research and development



Historically, clinical trials have relied heavily on white, male participants, but today we know that such a practice creates knowledge gaps in our understanding of the natural progression and treatment of acute and chronic conditions across diverse patient populations. Therefore, to ensure the efficacy of treatments within all groups, it is essential to enroll patients of all genders and racial and ethnic groups in numbers that reflect current population proportions.

Despite decades of efforts to address disparities, gender, racial, and ethnic minorities remain underrepresented in research and development. For example, African Americans make up 13.4% of the U.S. population, but only 5% of trial participants. Hispanics make up 18.1% of the U.S. population, yet they account for less than 1% of study participants. These enrollment disparities have real implications for improving health outcomes in traditionally underrepresented communities, and industry stakeholders are recognizing the need to achieve meaningful change.

For example, in 2022, the U.S. Food and Drug Administration issued draft guidance for clinical trial sponsors to develop and submit a “racial and ethnic diversity plan” before finalizing the study design. Early planning and proactive thinking ensure that drug developers take concrete steps towards progress. The plan should include specific enrollment goals for underrepresented and underserved racial and ethnic patient populations, along with operational strategies to achieve them.

In seeking innovative approaches to improve diversity and inclusion in clinical trials, industry participants are leveraging advanced technology and the rich data sources available to better understand patient needs and burdens, with the aim of increasing interest and participation in trials.

We have seen how a machine learning (ML) program can effectively generate and rank a list of trial sites that are likely to enroll more patients, based on the trial protocol (including eligibility criteria), previous trial performance, claims data, and patient demographics. However, by constantly fine-tuning ML capabilities, we find that purposefully designed deep reinforcement learning frameworks can better learn inclusion priorities while optimizing study site selection.

This article describes how deep learning frameworks can specifically address real-world challenges and enhance site selection while increasing trial diversity.

Real-world challenges of site selection

To address enrollment disparities, clinical trial sponsors are using data-driven methodologies to calculate the burden of trial protocols by race and ethnicity, participating in local events (such as health fairs) to raise awareness, providing culturally relevant communications to patients for stronger engagement, and much more. Early planning has shown that a multi-pronged approach can be effective.

To maximize the likelihood that a site will obtain an appropriately representative sample of participants from diverse backgrounds, sponsors should delve deeper into the nuances of site identification and evaluation methodologies. There are, however, two notable barriers in site identification that hinder progress on equity in clinical trial participation.

Missing data

Site identification often begins with patient and public engagement, site visits and clinical research coordinator participation, and analysis of claims and specialty data, including past enrollment performance and recruitment rates. These datasets can best inform a region's patient enrollment potential.

However, an issue to keep in mind is that trial sites serving large minority populations are more likely to have data gaps due to poor data collection and reporting. Efforts to increase the race and ethnicity data reported from trials are still a work in progress. Existing tools cannot handle missing data, so they tend to overlook sites in areas with large ethnic minority populations, which only exacerbates underlying inequities.

Enrollment and diversity trade-offs

Adding diversity as a criterion when targeting sites to maximize enrollment can be difficult. Equity cannot simply be imposed by setting quotas for each racial or ethnic group, because the small pool of underrepresented participants at sites selected by existing approaches effectively sets a cap on enrollment. The trade-off between enrollment needs and fairness must be balanced, so both objectives must be optimized simultaneously.
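To make the quota cap concrete, here is a minimal sketch in Python. The function name and the candidate counts are hypothetical and chosen only for illustration; the 13.4% and 18.1% shares are the population figures cited earlier in this article, and the remainder is lumped into a single "other" group. The sketch assumes a strict quota, where the enrolled cohort must match the target shares exactly.

```python
import math

def quota_enrollment_cap(available, target_share):
    """Largest total enrollment that still matches every target share exactly.

    available:    candidates reachable per group at the selected sites (hypothetical)
    target_share: desired fraction of the final cohort per group (sums to 1)
    """
    # Each group g can support at most available[g] / target_share[g] total
    # participants before it runs out; the tightest group sets the cap.
    return math.floor(
        min(available[g] / share for g, share in target_share.items() if share > 0)
    )

# Hypothetical pool of 1,000 candidates at sites picked for enrollment alone.
available = {"group_a": 40, "group_b": 60, "other": 900}
target_share = {"group_a": 0.134, "group_b": 0.181, "other": 0.685}

cap = quota_enrollment_cap(available, target_share)
print(f"Candidates available: {sum(available.values())}, cap under strict quotas: {cap}")
```

With these made-up numbers, the cap is set by the scarcest group relative to its target share, not by the total number of candidates. That is why a quota applied after site selection cannot replace choosing sites with both objectives in mind.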

Where deep reinforcement learning models can help

Given what must be considered to address the challenges above, clinical trial sponsors need to determine how best to optimize multiple site parameters to ensure better enrollment rates in diverse patient populations.

The use of ML solutions to improve the design and execution of clinical trials has spread. Over time, skilled data scientists gather more practice-based insights and apply them to further fine-tune ML-based models for the task at hand. Today, ML helps test hypotheses about trial feasibility, extract meaningful patterns in patient outcomes, drive trial design, predict trial outcomes, and more.

We can go a step further and use deep learning, a subfield of ML, to identify the best physicians to conduct studies and maximize patient recruitment. However, this approach does not take patient diversity into account.

To improve site selection with both diversity and enrollment in mind, data scientists tested a reinforcement learning model using data points from approximately 4,400 real-world clinical trials conducted from 2016 to 2021. The results show that this framework accounts for some important variables and can better address both challenges: the lack of site-specific data and the trade-off between enrollment and diversity.

Modality encoder for missing but required data

Most ML-based research assumes that datasets are complete and properly cleaned, but this is not realistic: in most real-world applications, data is often incomplete and results are skewed. Recognizing the need for a more unified view across sites where data is missing or insights are insufficient, data scientists tested this framework's ability to combine and enrich data from multiple sources, working around unavailable data to create a more comprehensive picture of each site. Moreover, even when data is missing, once this overall picture is available, the missing content can be inferred accurately.

Other existing ML-based strategies, such as modality dropout and cascading residual autoencoders, do not directly model missing data. However, this framework allows us to construct a more accurate representation of clinical trial sites without complete site data.
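As a rough illustration of building a site representation from incomplete inputs, the sketch below fuses whichever modality embeddings are present and records a presence mask. This is a deliberately simple stand-in, not the framework's actual missing-data mechanism, which the article does not spell out. The function name `fuse_modalities`, the modality names, and the two-dimensional vectors are all hypothetical.

```python
from typing import Optional, Sequence

Vector = Sequence[float]

def fuse_modalities(
    modalities: dict[str, Optional[Vector]], dim: int
) -> tuple[list[float], list[int]]:
    """Average the modality embeddings that are present for one site.

    Returns the fused vector plus a 0/1 mask (in sorted modality-name order)
    so that a downstream scorer can tell which inputs were missing.
    """
    names = sorted(modalities)
    mask = [0 if modalities[name] is None else 1 for name in names]
    present = [modalities[name] for name in names if modalities[name] is not None]
    if not present:
        # No data at all for this site: fall back to a zero vector.
        return [0.0] * dim, mask
    fused = [sum(vec[i] for vec in present) / len(present) for i in range(dim)]
    return fused, mask

# Hypothetical site whose demographic data was never reported.
site = {
    "claims": [1.0, 3.0],
    "demographics": None,
    "past_enrollment": [3.0, 5.0],
}
fused, mask = fuse_modalities(site, dim=2)
print(fused, mask)
```

The key point is that a site with a missing modality still receives a usable representation instead of being dropped from the candidate list. A learned approach would go further and infer the missing content from the data that is available.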

Efficient trade-offs: the “reward system”

To rank sites and their characteristics based on what is ideal for a particular trial, this deep reinforcement learning model integrates a specially designed reward function that emphasizes both enrollment and fairness metrics for the inclusion of diverse participants.

Because the representation of trial sites no longer has any “data holes,” the reward function can value each site's individual contribution to the overall reward. The reward is tied to the final set of sites chosen for the trial, and the model uses an encoding layer that allows each site's ranking score to be influenced by the features of other sites. As seen in Figure 1 below, this highlights which sites are ideal both for overall enrollment and for reaching a diverse population.
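One way to picture a reward that balances the two objectives is sketched below. It is an assumption-laden illustration, not the published reward function. Normalized Shannon entropy stands in for the fairness metric, and a single hypothetical `fairness_weight` sets the balance. All function names and per-site enrollment counts are invented for the example.

```python
import math

def diversity_entropy(counts):
    """Normalized Shannon entropy of group counts: 0 = one group only, 1 = perfectly even."""
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(counts))

def reward(selected_sites, max_enrollment, fairness_weight=0.5):
    """Score a set of sites on expected enrollment and diversity together.

    selected_sites:  list of per-group expected enrollment counts, one list per site
    max_enrollment:  normalizing constant so the enrollment term lies in [0, 1]
    fairness_weight: 0 rewards enrollment only, 1 rewards diversity only
    """
    pooled = [sum(group) for group in zip(*selected_sites)]
    enrollment_term = min(sum(pooled) / max_enrollment, 1.0)
    return (1 - fairness_weight) * enrollment_term + fairness_weight * diversity_entropy(pooled)

# Hypothetical expected enrollment per group at three candidate sites.
site_a, site_b, site_c = [90, 5, 5], [30, 30, 30], [80, 10, 10]
high_volume = [site_a, site_c]   # 200 participants, skewed toward one group
balanced = [site_a, site_b]      # 190 participants, more evenly spread

for weight in (0.0, 0.5):
    print(weight, round(reward(high_volume, 200, weight), 3), round(reward(balanced, 200, weight), 3))
```

With the fairness weight at zero, the higher-volume pair scores best. With equal weighting, the slightly smaller but more balanced pair comes out ahead. Note that the score is computed on the selected set as a whole, so the value of adding a site depends on which other sites are already in the set.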


Figure 1:
Visualization of deep reinforcement learning models considering fair ranking with missing modalities. This framework uses multimodal site features and clinical trial representations to generate scores for ranking and selection of a subset of prospective clinical trial sites. The pipeline used to do so consists of a modality encoder, a missing data handling mechanism, a scoring network, and a reinforcement learning-based ranking approach. Credit: Theodorou B, Glass L, Xiao C, Sun J. 2024. CC BY 4.0.


Continually building the use case


The ML-driven solution is part of a more holistic approach to optimizing site selection that prioritizes adequate representation of diverse patient populations in trials. Therefore, it is important that data scientists and other stakeholders constantly refine their techniques to better uncover the insights of interest in a fair and accurate manner.
It is also essential to ensure that the ML approaches used are grounded in sound science and guided by the right group of clinical trial subject matter experts, including medical, clinical, and data science experts.

Because deep reinforcement learning is based on insights gleaned through trial and error, these models continually evolve to meet the needs at hand. As used today, the new model described above allows trial sponsors to work without complete data and to limit or eliminate bias within that input. It helps the industry address the long-standing challenge of selecting sites that can increase the diversity of enrolled patient groups while protecting enrollment rates.

The opportunity for this model and other deep learning tools to help drive smarter decisions in research and development, and so enhance patient care, will likely grow over time as more insights are collected and explored.

About the author:

Greg Lever is the Director of AI Solutions Delivery at IQVIA. With over 14 years of experience in science and technology, Greg currently works within IQVIA's Applied Data Science Center, consulting with clients to help them discover innovative ways to deliver life-changing treatments to patients faster. Previously at IQVIA, he led a team of machine learning engineers within the Analytics Center of Excellence.

Greg has worked with several technology start-ups in London and supported Genomics England's 100,000 Genomes Project to project completion. He completed his PhD at the University of Cambridge, combining quantum physics and machine learning to develop new approaches to small molecule drug discovery, and has worked as a postdoctoral researcher at MIT.


