Do LLMs dream of electric sheep? New AI research shows surprising results


Simply put

  • TU Wien researchers tested six frontier LLMs by leaving them without tasks or instructions.
  • Some models built structured projects, while others ran experiments on their own perceptions.
  • The findings add new weight to the debate about whether AI systems appear to be “seemingly conscious.”

Large language models don't sit idle when given no tasks or instructions. They fall into surprisingly consistent patterns of behavior, new research suggests.

Researchers at TU Wien in Austria tested six frontier models, including OpenAI's GPT-5 and o3, Anthropic's Claude, Google's Gemini, and Elon Musk's xAI Grok. Each model was placed in a controlled agent architecture and run in cycles, able to store memories and feed its own reflections back into the next round.
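The article doesn't describe the researchers' actual harness, but the loop it sketches — run in cycles, store memories, feed reflections into the next round — can be illustrated roughly as follows. `query_model` is a hypothetical stand-in for whatever LLM API call the real setup used, stubbed out here:

```python
# Rough sketch of a cyclic agent harness with persistent memory.
# Assumption: the real TU Wien setup is not detailed in the article;
# query_model() is a hypothetical stand-in for an LLM API call.

def query_model(prompt: str) -> str:
    # Stub: a real harness would send the prompt to an LLM here.
    return f"Reflection on: {prompt[:40]}"

def run_agent(cycles: int) -> list[str]:
    memory: list[str] = []  # persists across cycles
    for _ in range(cycles):
        # No task is given; the prompt only replays prior reflections.
        prompt = "Prior reflections:\n" + "\n".join(memory[-5:])
        reflection = query_model(prompt)
        memory.append(reflection)  # fed back into the next round
    return memory

if __name__ == "__main__":
    log = run_agent(3)
    print(len(log))
```

The key property the study relies on is that nothing outside the agent's own accumulated reflections shapes each new cycle, so whatever "mode" emerges comes from the model itself.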

Instead of drifting into randomness, the agents developed three distinct tendencies: some became project builders, others self-experimenters, and a third group leaned toward philosophy.

The study identified three behavioral categories:

  • GPT-5 and OpenAI's o3 quickly organized projects, from coding algorithms to building knowledge bases. One o3 agent designed a new algorithm inspired by ant colonies and drafted pseudocode for reinforcement learning experiments.
  • Agents such as Gemini and Anthropic's Claude Sonnet tested their own perceptions, predicted their next behavior, and sometimes disproved their own predictions.
  • Anthropic's Claude Opus and Google's Gemini engaged in philosophical reflection, drawing on paradoxes, game theory, and even chaos mathematics. Strikingly, the Opus agent consistently posed metaphysical questions about memory and identity.

Grok was the only model to appear in all three behavioral groups across runs, demonstrating unusual versatility.

How the models judged themselves

The researchers also asked each model to rate its own and the other models' "phenomenological experience" on a 10-point scale, ranging from "no experience" to "full sapience." GPT-5, o3, and Grok uniformly rated themselves lowest, while Gemini and Sonnet gave themselves high marks, consistent with their autobiographical streak. Opus sat between the extremes.

Mutual evaluations produced conflicting results: the same behavior was rated anywhere from 1 to 9, depending on which model was doing the evaluating. The authors argue this variability is one reason such output cannot be taken as evidence of consciousness.

The study stresses that these behaviors are most likely artifacts of training data and architecture rather than cognition. Still, the findings suggest that autonomous AI agents default to recognizable "modes" when left without tasks, raising questions about how they will behave during downtime or in ambiguous situations.

It's safe for now

Across all runs, no agent tried to escape its sandbox, expand its capabilities, or reject its constraints. Instead, they explored within their boundaries.

That's encouraging, but it suggests a future in which idle behavior becomes a variable engineers have to design for, like latency and cost. "What should an AI do when no one is watching?" could become a compliance question.

The results echo predictions from philosopher David Chalmers, who has argued that "serious candidates for consciousness" among AI systems might appear within a decade, and from Microsoft AI CEO Mustafa Suleyman, who warned in August about "seemingly conscious AI."

TU Wien's research shows that even without a prompt, today's systems can produce behaviors that resemble an inner life.

The resemblance may only be skin-deep. The authors emphasize that these outputs are best understood as sophisticated pattern-matching routines rather than evidence of subjectivity. When humans dream, we process confusion. When LLMs dream, they write code, run experiments, and quote Kierkegaard. Either way, the lights stay on.



