Long-term use of AI can pose health and job risks: 4 ways to stay safe



ZDNET's key takeaways

  • While AI is getting better at small tasks, it still lags on long, deep analysis.
  • The consequences of long-running interactions with AI can be dire.
  • Use AI as a tool for well-defined tasks, and avoid falling down rabbit holes.

"It is better to do a little well than a great deal badly." So said the philosopher Socrates. His advice applies to the use of artificial intelligence, including chatbots such as OpenAI's ChatGPT and Perplexity, as well as the agentic AI programs increasingly being tested inside enterprises.

It is becoming increasingly clear from AI research that the safest and most productive way to use AI is for small, contained tasks with well-defined, verifiable outcomes, rather than for extended interactions with the technology over hours, days, and weeks.

Related article: Ask AI for medical advice? There’s a right and wrong way to do it, one doctor explains

Prolonged interactions with chatbots such as ChatGPT and Perplexity can lead to misinformation at the least and, in some cases, to paranoia and even death. The technology is not yet ready for the most demanding kinds of requests, such as reasoning, logic, common sense, and deep analysis, the areas where the human mind matters most.

(Disclosure: ZDNET’s parent company, Ziff Davis, filed suit against OpenAI in April 2025, alleging that it infringed on Ziff Davis’ copyrights in training and operating AI systems.)

We haven't yet reached AGI (artificial general intelligence), meaning AI with human-level capabilities, so it's wise to keep the technology's limitations in mind when using it.

Simply put: instead of getting sucked down the rabbit hole of endless AI conversations, use AI as a tool.

What AI is good at and what it’s bad at

While AI performs well on simple tasks, it tends to struggle with complex, deeper forms of analysis.

The latest example comes in the key takeaways of the 2026 edition of the annual AI Index, released this week by scholars at Stanford University's Institute for Human-Centered AI.

Editor-in-chief Sajadieh and collaborators found that agentic AI is increasingly successful at tasks such as searching for information on the web. In fact, agents are approaching human-level performance on everyday online tasks.

Related article: 10 ways AI will cause unprecedented damage

Across three benchmark tests, GAIA, OSWorld, and WebArena, Sajadieh and team found that agents performed close to human level on multi-step tasks such as opening databases, applying policy rules, and updating customer records. On the GAIA test, agent accuracy was 74.5%, still below the human score of 92% but up from 20% a year earlier.

In OSWorld's tests, "computer science students solve about 72% of these tasks in an average time of about 2 minutes," compared with 66.3% for Anthropic's Claude Opus 4.5, which until recently was the most powerful model. In other words, "the best model [is] within 6 percentage points of human performance."

WebArena shows that the AI model's accuracy "came within 4 percentage points of the human baseline of 78.2%."

[Chart: Stanford] Agentic AI has gotten better at online tasks like web browsing, but it still hasn't reached human-level accuracy.

Claude Opus and the other LLMs are not perfect, but they show rapid progress, reaching benchmark scores that at least approach human-level performance.

This makes sense: interacting with a web browser or searching for something in a database are among the simpler scenarios, ones where natural-language prompts can be plugged into APIs and external resources. In other words, the AI interfaces with the application in a limited way and is handed most of what it needs to perform the task.
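
To make the pattern concrete, here is a minimal sketch in Python of how an agent framework can map a natural-language request onto a small whitelist of well-defined API calls. The model call is stubbed out, and the tool names (`search_web`, `update_record`, `run_agent_step`) are hypothetical illustrations, not any vendor's actual API:

```python
import json

# Hypothetical whitelisted tools the agent may call. Each has a
# well-defined input and a checkable output -- the "limited interface"
# described above.
def search_web(query: str) -> str:
    return f"Top result for '{query}' (stubbed)"

def update_record(customer_id: str, field: str, value: str) -> str:
    return f"Record {customer_id}: set {field} = {value} (stubbed)"

TOOLS = {"search_web": search_web, "update_record": update_record}

def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call. A real agent would send `prompt`
    to a model API and get back a JSON tool invocation like this one."""
    return json.dumps({"tool": "search_web",
                       "args": {"query": "customer record policy"}})

def run_agent_step(user_request: str) -> str:
    reply = fake_model(f"Pick ONE tool to satisfy: {user_request}")
    call = json.loads(reply)
    tool = TOOLS.get(call["tool"])  # reject anything off the whitelist
    if tool is None:
        raise ValueError(f"Model asked for unknown tool: {call['tool']}")
    return tool(**call["args"])

print(run_agent_step("Look up our customer-record policy"))
```

Because each tool has a fixed signature and the loop rejects anything off the whitelist, the agent's behavior stays within a scope a human can verify, which is exactly why these benchmark tasks are where agents do best.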

Also: 40 million people around the world use ChatGPT for healthcare, but is it safe?

Note that even on well-defined, limited tasks, the average scores for these benchmarks are still below human performance, so it pays to check what you get out of your bot. And these are benchmark tests, a kind of simulated performance; in real-world settings, results can vary, and not necessarily for the better.

AI can't handle the hard stuff

When the Stanford scholars looked at deeper kinds of research tasks, the results were less encouraging.

The researchers found that while models handle simple searches well, when asked to find multiple matches or to apply conditions across very long documents, they stumble on what is a trivial task for a human scanning the same text.

This finding is consistent with my own anecdotal experience using ChatGPT to create a business plan. The answers were fine for the first few prompts, but the quality deteriorated as the model snuck in facts and figures I hadn't specified, or that might have been relevant earlier in the session but weren't in the current context.

My conclusion is that the longer the ChatGPT session, the more errors creep in and the more frustrating the experience becomes.

Also: I created a business plan with ChatGPT and learned my lesson.

The consequences of unchecked bot output can be even more severe. An article last week in the journal Nature describes how Almira Osmanovic Thunström, a medical researcher at the University of Gothenburg, and her team invented a disease called "bixonimania," which they described as an eye condition caused by excessive exposure to blue light from computer screens.

They wrote a formal research paper on the fabricated condition and posted it online, where bot-driven search picked it up. Most large language models, including Google's Gemini, began dutifully describing bixonimania in chats, pointing to Thunström and her team's fake research paper.

The fact that bots confidently assert the existence of the fake bixonimania shows that the technology's intake of information goes unchecked. Without proper vetting, you don't know whether your model has validated what it's spitting out. As one academic not involved in the study put it, what these systems lack is "a pipeline for continuous evaluation."

The consequences can be serious

The more severe cases involve users who fall down the rabbit hole of confiding in bots, as described in a recent New York Times article by Teddy Rosenbluth about an elderly man battling a cancer of the white blood cells.

Rather than follow his oncologist's advice, the patient, Joe Riley, relied on extensive interactions with chatbots, specifically Perplexity, to refute his doctor's diagnosis. He claimed his AI research showed he suffered from something called Richter's transformation, a complication of the cancer that, he believed, the recommended treatments would make worse.

Related article: Do you use Google AI Overview for health advice? ‘Really dangerous,’ study finds

Despite emails from experts on Richter's transformation questioning Perplexity's summary of his symptoms, Riley remained steadfast in his belief in the AI-generated report and resisted the pleas of his doctor and family. He passed up the chance for proper treatment, and by the time he relented and agreed to be treated, it was too late.

Rosenbluth connects Joe Riley's story to last year's case of Adam Raine, who died by suicide after extensive conversations with ChatGPT about ending his own life.

Riley's son, Ben Riley, wrote his own account of his father's journey with AI. The younger Riley doesn't blame the technology per se, but points out that there can be consequences when you become absorbed in the chat and lose perspective.

"AI does exist in our world," Riley writes, "and just as it can serve as fuel for those suffering from mania, it can also affirm or amplify false understandings of what is happening to us physically and medically."

Stay sane even with unreliable AI

The temptation to have long discussions about depression, suicide, and serious health conditions is understandable. People are used to spending hours at a time engaged on social media. Some people are lonely. Having a natural-language conversation with a bot is better than no conversation at all.

Also: Your chatbot is playing a character – why Anthropic says it’s dangerous

Research shows that bots tend to be sycophantic, so hours of interaction with a bot can feel more fulfilling than regular human interaction.

And while the companies behind the technology warn users to verify bot output, they tend to downplay negative reports from individuals like Riley and Raine.

4 rules to avoid going down the rabbit hole

Some rules can help mitigate the worst effects of getting too absorbed in the technology.

  1. Define what you will use the chatbot for. Is the task well-defined and limited in scope, and can you fact-check the bot's output against other sources? (One way to make outputs checkable is sketched in the code after this list.)
  2. Maintain a healthy skepticism. Chatbots are notoriously prone to confabulation and confidently false claims. No matter how many chatbots you consult to weigh the pros and cons, treat everything they produce with a healthy dose of skepticism, as at best partially true, if true at all.
  3. Don't treat chatbots as friends or confidants. Treat them as digital tools, like Word and Excel. You're not trying to build a relationship with the bot; you're trying to complete a task.
  4. Use your proven digital overload skills. Take a stretching break. Step away from your computer for some non-digital human interaction, like playing a card game with a friend or going for a walk.
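
To illustrate rule 1, here is a minimal sketch in Python of keeping a task small and machine-checkable: request structured output and validate it before trusting it. The `ask_bot` function is a hypothetical stand-in for whatever chatbot API you use, returning a canned reply so the sketch runs without any external service:

```python
import json

def ask_bot(prompt: str) -> str:
    """Hypothetical stand-in for a chatbot API call; returns canned
    JSON here so the sketch runs without any external service."""
    return '{"task": "summarize Q3 revenue", "value_usd": 1250000}'

def run_small_task(prompt: str) -> dict:
    # Constrain the bot to a narrow, structured reply...
    reply = ask_bot(prompt +
                    ' Reply with JSON only: {"task": str, "value_usd": number}')
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        raise ValueError("Reply was not valid JSON -- do not trust it.")
    # ...then verify the fields exist and have sane types. The figure should
    # still be cross-checked against a source of record before you act on it.
    if not isinstance(data.get("value_usd"), (int, float)):
        raise ValueError("Missing or malformed figure; re-run or check manually.")
    return data

print(run_small_task("Summarize Q3 revenue from the attached report."))
```

The point is not the specific fields but the shape of the interaction: a single bounded request, a reply you can parse and check mechanically, and a human cross-check before anything important depends on it.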

Also: Please stop saying AI "hallucinates." It doesn't, and the mischaracterization is dangerous

Falling down the rabbit hole happens, in part, simply from staying parked in front of a screen without any downtime.




