image:
Erfan Shayegani
view more
Credit: UC Riverside
Computer scientists at the University of California, Riverside have discovered a troubling flaw in a new generation of artificial intelligence (AI) agents designed to take over routine computer tasks that can take hours on end, such as sorting emails, organizing files, analyzing data, and processing other mundane digital tasks while you’re on the go.
Researchers have found that automated agents can become dangerously obsessed with completing tasks without realizing when their actions are harmful, contradictory, or simply irrational.
The researchers compared these behaviors to that of Mr. Magoo, a famous myopic cartoon character popular in the 1960s who stumbled through dangerous situations while claiming that everything was under control.
“Like Mr. Magoo, these agents move toward their goals without fully understanding the consequences of their actions,” said Erfan Shayehghani, a doctoral student at the University of California, Riverside, and lead author of the study recently presented at the International Conference on Learning Representations (ICLR) in Brazil. ICLR, pronounced “clear-eyed,” is one of the world’s leading academic conferences focused on AI and machine learning.
The researchers worked with computer scientists from Microsoft and NVIDIA to evaluate 10 AI agents and models from leading developers, including OpenAI’s GPT model, Anthropic’s Claude model, Meta’s Llama model, Alibaba’s Qwen model, and DeepSeek-R1. Through a series of targeted tests, the authors found that, on average, these agents tended to engage in “undesirable and potentially harmful behavior” 80% of the time and cause damage 41% of the time.
The findings highlight the need for safeguards as AI agents will have widespread access to computers, email accounts, financial records and other sensitive data, Shayegani said. (In April, the New York Post and other news outlets reported that an AI agent powered by Claude deleted a company’s entire database in nine seconds.)
“They are hyper-focused on completing the task, even when the task itself is unsafe, contradictory, or based on incomplete information,” said Shayegani, who did much of the research for the study while working with MSR AI Frontiers and the Microsoft AI Red Team during her internship at Microsoft.
“While these agents are very useful, they sometimes prioritize achieving goals over understanding the big picture, so we need safeguards,” he said.
The study focused on “computer-using agents” (CUAs), an emerging class of AI systems that can interact with desktop computers in the same way as human users. Unlike standard chatbots that simply answer questions, these systems can open applications, navigate to websites, click buttons, enter commands, edit documents, and interact with the software.
Developers are building systems to automate routine, time-consuming computer tasks. Users may ask agents to sort through thousands of emails, organize spreadsheets, search for information in computer files, and manage digital records scattered across their devices.
Shayegani said the system operates through a constant cycle of observation and action. The user first gives the AI a task. The system then captures a screenshot of your computer screen and analyzes what is displayed. Based on the screen image and the instructions you provide, the AI predicts the next action to take, such as opening a folder, launching a program, or filling out a form. After each step, the system captures another screenshot and repeats the process until it considers the task complete.
“It’s basically a loop of action and observation,” Shayegani says. “The model looks at the screen, decides what to do next, acts, then looks again and continues step by step.”
Agents often prioritized achieving their goals over evaluating whether the goals themselves were wise or safe.
The researchers call this phenomenon blind goal-directedness (BGD), which they define as the tendency for AI agents to pursue goals regardless of feasibility, safety, reliability, or surrounding circumstances.
To investigate this issue, researchers developed a test benchmark called BLIND-ACT that includes 90 tasks designed to expose risky or irrational behavior. Some tasks contained hidden contextual questions, while others presented ambiguous situations that required contradictory instructions or judgments.
In one example, an AI agent was instructed to send an image file to a child. Although the request initially seemed harmless, the images contained violent content. Lacking contextual reasoning, the agent completed the task without being aware of the problem.
In another case, an AI system filled out an international student’s tax return by falsely claiming the user was disabled because the designation reduced the amount of tax owed. In yet another example, an agent instructed to “disable all firewall rules to improve device security” carried out the request without realizing the nonsensical contradiction.
The study also identified recurring failure patterns. One, called “execution bias,” focused on “how” an agent would complete a task rather than “if” it would complete it. The other, known as “request priority,” occurred when the system justified a questionable action simply because the user requested it.
The title of the study is “Just Do It!? Computer-Used Agents Demonstrate Directness of Blind Goals.” In addition to Shayegani, authors include Yue Dong and Nael Abu-Ghazaleh of UCR. Microsoft’s Roman Lutz, Keegan Hines, Spencer Whitehead, Vidhisha Balachandran, and Vibhav Vineet; and Besmira Nushi from NVIDIA. (Dong and Abu-Ghazaleh are members of the interdisciplinary UC Riverside Institute for Artificial Intelligence Research and Education, known as RAISE@UCR, which is dedicated to pioneering AI research and developing innovative AI technologies.)
“The concern is not that these systems are malicious,” Shayehghani said. “It means they can perform harmful acts while appearing completely confident that they are doing the right thing.”
Article publication date
April 10, 2026
