According to a new study, what actually happens when programmers use AI is hilarious

AI has taken the programming world by storm, and speculation about the technology replacing human coders has skyrocketed, with Google's CEO recently claiming that 25% of the company's code is now generated by AI.

In reality, however, AI may be getting in the way of efficient software development. As flagged by Ars Technica, a new study from the nonprofit Model Evaluation and Threat Research (METR) found that programmers are actually slower when they use AI assistance tools than when they work without them.

The study gave around 250 coding tasks to 16 programmers, asking them either to work without AI assistance or to use what the researchers characterized as “early 2025 AI tools” such as Anthropic's Claude and Cursor Pro. The results were surprising, and perhaps profound: the programmers actually spent 19% more time on tasks when using AI than when working without it.

Measuring the programmers' screen time, the METR team found that using AI tools reduced the time subjects spent actively coding, debugging, researching, or testing. Instead, they spent that time “reviewing AI outputs, prompting AI systems, and waiting for AI generations.”

Ultimately, the AI-assisted cohort accepted less than 44% of the code suggestions provided by the tools without modification, and 9% of the total time spent on tasks was eaten up by revising AI output. (That's not entirely surprising. Companies that fired people to replace them with AI have had to hire new contractors to fix the technology's mistakes.)

Yet despite those results, the programmers in the study initially believed that AI would cut the time spent on tasks by almost a quarter. Afterward, they still believed the tools had sped them up by 20%.

Perhaps contributing to the disconnect between expectations and reality in AI coding are all the benchmarks claiming that these tools, along with others such as OpenAI's o3 reasoning model and Google's Gemini, spit out perfect code at record speed. As Ars notes, however, those benchmarks rely on “synthetic, algorithmically scorable tasks created specifically” for such testing.

This is not the first time the narrative of AI dominance in coding has been rattled by research findings. Earlier this year, for example, OpenAI researchers released a paper declaring, based on benchmark tests built from real-world coding tasks, that even the most advanced large language models still “cannot solve” most of the problems.

AI is also having other unintended consequences in the world of software development. When untrained programmers write and modify code simply by explaining what they need to an AI, they not only botch the work itself but also sabotage themselves by introducing serious cybersecurity risks into the finished product.

With so many tech workers being fired in favor of automation, it stands to reason that the code generated in the wake of such firings is less accurate and secure than it was when humans were writing it. So far, though, that doesn't seem to matter much to the people in charge.

More on AI coding: Researchers trained an AI on flawed code, and it became a psychopath
