Framing AI as a verification aid rather than a primary decision-making tool lessened the negative impression, but only partially.
While clinicians believe that generative artificial intelligence (AI) tools that support medical decision-making are valuable, they also perceive colleagues who use the technology as less skilled, research shows.
In the experiment, physicians who used generative AI as their primary decision-making tool received significantly lower ratings for clinical skill, competence, and the overall healthcare experience provided compared with physicians who did not use AI at all, Yang (Johns Hopkins University, Baltimore, MD) and colleagues report in a paper recently published online in NPJ Digital Medicine.
Framing AI only as a validation aid improved the ratings somewhat, but they were still worse than the ratings of physicians who did not use the tool.
“As AI tools become more commonly used in healthcare and medicine, I think this really just shows that there are challenges, barriers to increased adoption and use,” Wolf said.
“But it also emphasizes the need for a thoughtful approach to its implementation,” she added. “People need to understand the specific AI tool we are using, what it does and how it helps, and make sure that it is fair in its use, that it helps, and that it doesn't exacerbate the problems and disparities that may exist.”
Since the introduction of ChatGPT in November 2022, the use of generative AI has increased dramatically worldwide, including within the medical community. In early 2024, researchers found that over 70% of healthcare organizations were either moving toward AI adoption or had already integrated it into their workflows.
Older computerized tools designed to aid clinical decision-making have faced a variety of barriers, but generative AI “represents a major change, with the ability to process free-form, unstructured data, generate human-like responses, provide rapid insights, and offer more flexible and accessible tools,” write Yang et al.
However, medical applications of AI remain somewhat limited, and the researchers suggest that one potential barrier is physicians' concern that using the tools will hurt their reputation.
Yang, Wolf, and colleagues explored that possibility in a randomized experiment that included 276 clinicians (178 physicians, 28 fellows/residents, 60 advanced practice providers, and 10 in other clinical roles) from across the Johns Hopkins medical system. Participants were randomly assigned vignettes in which a physician evaluated a patient with diabetes and recommended a new antihyperglycemic medication, presented in one of three contexts:
- No use of generative AI (control)
- Generative AI used as the primary decision-making tool
- Generative AI used to verify the physician's clinical assessment
On a scale of 1 to 7, physicians who used AI as the primary decision-making tool received lower ratings of clinical skill than those who did not use AI (mean 3.79 vs 5.93; P < 0.001). Ratings of physicians who used AI for verification fell between the two (mean 4.99).
This pattern was also seen for assessments of competence and overall healthcare experience, but these were mediated by the assessment of clinical skill; that is, lower ratings of clinical skill also pulled down ratings in these domains.
Weakness or strength?
The findings, the researchers say, are in line with the broader literature on advice-taking, which shows that “reliance on external input is perceived as a weakness rather than strength.”
That is despite the fact that clinicians participating in the study rated generative AI as helpful in ensuring the accuracy of physicians' clinical assessments, both overall (mean 4.30) and when the AI tool is customized for their own institution (mean 4.96).
“These findings suggest that clinicians consider GenAI useful, but its use may have a negative impact on peer assessment,” write Yang et al.
And that's not necessarily fair, Wolf said. “I think that with AI and GenAI becoming more ubiquitous, more people will use it. And I think our perspective and approach to this must change.”
For rural primary care physicians without ready access to many specialists, for example, generative AI could prove to be a very useful resource in certain cases, Wolf said.
However, she noted that there is still a huge gap between the development of AI tools and their clinical implementation. “But I think that will change over the next few years,” she said. “We must be very thoughtful about how we perceive this and how we implement it. But we must also start working with AI and training the next generation of doctors on how to use AI safely and effectively.”
