Why can’t AI replace human functions?

Machine Learning


Generative AI systems have made tremendous advances in recent years. They can write poetry, produce complex computer code, create photorealistic images, and even generate full-motion video from text prompts. The results are astonishing.

But will these systems ever fully replicate, and replace, humans?

To answer that question, we need to look at AI along three different axes: functional ability, economic viability, and spiritual value. Seen this way, a clear picture emerges: generative AI is powerful, but it has fundamental limitations. We have published videos on each of these axes (here, here, and here), but in this post we look at why AI will not functionally replicate or replace humans.

From Claude Shannon to Claude.AI

The roots of modern large language models go back to 1948, when Claude Shannon, the father of information theory, ran a simple but far-reaching experiment. He built a text prediction engine that generated English by predicting the next word from the previous one.

Using a so-called second-order word approximation (a bigram model), Shannon generated the following text:

THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED

It makes no sense, but it has local coherence. The phrase about a frontal attack on an English writer sounds like a news headline. The model was one of the simplest imaginable, predicting the next likely word based solely on statistical frequency, yet it produced semi-grammatical, readable text.

But what if we scaled up this model? Instead of conditioning on the previous word, what if we conditioned on the previous 3,000, 30,000, or 300,000 words? We’d end up with something closer to Claude or ChatGPT.
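Shannon's experiment is easy to reproduce. Here is a minimal sketch; the tiny corpus and the `generate` helper are illustrative choices of mine, not Shannon's actual setup. We record which words follow which in the corpus, then generate by repeatedly sampling a plausible next word.

```python
import random
from collections import defaultdict

random.seed(0)

# A tiny toy corpus; Shannon sampled from printed English.
corpus = (
    "the cat sat on the mat and the dog sat on the rug "
    "and the cat saw the dog on the mat"
).split()

# Bigram table: for each word, the words that follow it (with repetition,
# so frequent continuations are sampled more often).
follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

def generate(start, n_words):
    """Generate text by repeatedly sampling a statistically likely next word."""
    out = [start]
    for _ in range(n_words - 1):
        candidates = follows.get(out[-1])
        if not candidates:  # dead end: the word never appears mid-corpus
            break
        out.append(random.choice(candidates))
    return " ".join(out)

print(generate("the", 12))
```

Every adjacent word pair in the output occurs somewhere in the corpus, which is exactly why the text looks locally plausible while meaning nothing globally.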

These models produce coherent, readable text. Consider, for example, the following output from OpenAI’s GPT-5:

Beneath the spire where scholars tread,
The Cam flows softly through the willows.
Old stones and ivy protect the past;
A place where thoughts continue like whispers in the breeze.

A bicycle rattles along the cobblestone road.
Newton’s eyes light up my heart.
Dusk envelops the blessings of the chapel,
In the dreaming tower, time leaves its mark.

It’s not Shakespeare, but it’s poetry.

Modern generative AI systems can do more than write prose and poetry. They can build apps, plan trips, create lesson plans, prove number-theory conjectures, perform what looks like rational reasoning, generate images and videos, compose music, and hold conversations. They summarize research papers, script podcasts, and spread misinformation and disinformation. Generative AI systems are like a digital wizard’s spellbook, waiting for the right spell (prompt) to conjure up what previously took days or months of human effort.

What gives them these powers?

How large language models actually work

Two central ideas underpin generative large language models (LLMs): attention and embeddings.

Text embeddings work by analyzing the statistical relationships between words. Words that appear in similar contexts are represented as vectors located close to each other in a high-dimensional mathematical space. For example, take the image below, which relates “animal” words to “food” words:

  • “Fish”, “bear”, and “wolf” cluster together as animals.
  • “Salad”, “stew”, and “fish” cluster together as foods.

Interestingly, “fish” belongs to both semantic neighborhoods, so its embedding vector sits in their intersection.
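A toy sketch of the idea: the vectors below are hand-made for illustration (real embeddings are learned and have hundreds or thousands of dimensions), but they show how cosine similarity places “fish” near both the animal cluster and the food cluster.

```python
import math

# Hand-crafted 2-D toy vectors (hypothetical, for illustration only):
# dimension 0 is roughly "animal-ness", dimension 1 roughly "food-ness".
vectors = {
    "bear":  [0.9, 0.1],
    "wolf":  [0.8, 0.1],
    "salad": [0.1, 0.9],
    "stew":  [0.1, 0.8],
    "fish":  [0.7, 0.7],  # lives in both neighborhoods
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

print(cosine(vectors["fish"], vectors["bear"]))   # high: fish as animal
print(cosine(vectors["fish"], vectors["salad"]))  # high: fish as food
print(cosine(vectors["bear"], vectors["salad"]))  # low: unrelated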

Large language models and diffusion models are deep neural networks that convert words and pixels into numbers and vectors. They turn meaning into a problem of mathematical geometry. They encode relationships between symbols.

But here’s the important point: they know what a thing statistically looks like, but not what it actually is. They are trapped behind a wall of syntax and numbers.

The illusion of reasoning

LLMs certainly appear to reason. But do they?

In 2022, a paper from UCLA’s StarAI Lab showed that models that perform nearly perfectly on reasoning tasks break down when the problems are sampled slightly differently (i.e., when the problem distribution shifts). If a system has actually learned formal logic, how the problems are selected should not matter: it would apply the same logical rules to every problem, whether drawn from a textbook or posed by a friend. But if the system is only doing surface-level statistical pattern matching that looks like logical reasoning, the sampling of problems matters a great deal. And that is exactly what happened.

In 2024, Mirzadeh and colleagues further showed that adding extraneous information to a math problem can cut the performance of state-of-the-art models by up to 65 percent.

Consider the following example from their paper.

Oliver picks 44 kiwis on Friday, 58 on Saturday, and twice as many on Sunday as on Friday, but five of Sunday’s kiwis were a bit smaller than average. How many kiwis does Oliver have?

The extraneous detail about size led the model to subtract 5 and answer 185 instead of the correct 190. Why do irrelevant details derail mathematical reasoning systems?

The answer is that these systems were never reasoning; they were doing statistical pattern matching. If the word problems in the training data follow patterns in which every numeric detail matters, the system is conditioned to treat irrelevant details as relevant. Its lack of understanding lets it be fooled by patterns.
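The intended computation is trivial arithmetic; smaller-than-average kiwis are still kiwis. A quick check of the numbers from the example:

```python
# Friday, Saturday, Sunday picks from the kiwi example.
friday, saturday = 44, 58
sunday = 2 * friday          # "twice as many on Sunday as on Friday"

total = friday + saturday + sunday
print(total)                 # 190: the size detail is irrelevant to the count

# The reported failure mode: treating the irrelevant size detail
# as an instruction to subtract.
wrong = total - 5
print(wrong)                 # 185
```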

Some might argue that humans can be fooled by statistical patterns too. True. But when we are fooled by spurious statistical regularities, we are not performing valid rational inference; we are doing something entirely different.

That’s my point.

Model collapse: when AI trains with its own output

So AI may not really perform rational reasoning. But can it at least produce information of the same quality as humans?

In 2023, Shumailov and colleagues showed that models trained on their own outputs begin to degenerate. Repeat the process and you get what they call model collapse: by the eighth or ninth generation, the system produces incoherent nonsense.

There is thus a fundamental asymmetry between information produced by humans and information produced by LLMs. Feed the model human-written text and it improves. Train it on its own output and it gets worse.

Why do models collapse? Because finite data cannot fully capture the richness of the underlying distribution. A finite approximation cannot perfectly represent the living source it is meant to model. At best, it remains a simplified caricature.
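A minimal simulation of the effect, using word frequencies instead of a real language model (my own toy setup, not Shumailov et al.’s experiment): each “generation” re-estimates a word distribution from a finite sample of the previous generation’s output. Rare words drop out of the sample, and once a word is gone it can never come back, so the vocabulary steadily shrinks.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = 1000       # distinct "words" in the toy language
SAMPLE = 500       # tokens observed per generation
GENERATIONS = 10

# Zipf-like "human" word distribution: p(rank) proportional to 1/rank.
true_probs = 1.0 / np.arange(1, VOCAB + 1)
true_probs /= true_probs.sum()

def retrain(source_probs):
    """'Train' a model: estimate word frequencies from a finite sample."""
    counts = rng.multinomial(SAMPLE, source_probs)
    return counts / counts.sum()

model = retrain(true_probs)            # generation 1 trains on human data
support = [int((model > 0).sum())]     # how many words the model still "knows"
for _ in range(GENERATIONS - 1):
    model = retrain(model)             # later generations train on model output
    support.append(int((model > 0).sum()))

print("distinct words per generation:", support)
```

The vocabulary count is monotonically non-increasing by construction: a generation can only emit words its parent still had. Real model collapse is richer than this sketch, but the mechanism (tails of the distribution vanishing under repeated finite sampling) is the same.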

An information-theoretic principle called the data processing inequality (DPI) helps explain why. It can be summarized as follows:

Smart processing cannot increase the information content of a signal beyond what it already contains.
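Formally, the DPI says that if a variable Z depends on X only through an intermediate Y (a Markov chain), then no processing of Y can recover more information about X than Y already carries:

```latex
% Data processing inequality: for a Markov chain X -> Y -> Z,
% mutual information can only decrease along the chain.
\[
  X \to Y \to Z \quad\Longrightarrow\quad I(X;Z) \le I(X;Y)
\]
% Reading it for our case: X is humanity, Y is the finite training data,
% and Z is the trained model. The model can carry no more information
% about humanity than the training data already contains.
```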

No matter how much training data you collect, that data will forever be a finite snapshot of what it means to be human. By the DPI, processing it with a large language model cannot add information: no model of humanity can exceed what is contained in its finite training data. And that data always falls short.

Imagine collecting all the texts and artifacts from the year 1725. This would give a snapshot of humanity, but an incomplete one. Doing the same in 2026 would still give an incomplete picture. No matter how many books, blog posts, and images we collect, no finite dataset can fully capture what it means to be human.

Because AI systems are trained only on finite snapshots, they can only ever produce imperfect replicas.

Syntax is not semantics

Finite data and imperfect approximation limit AI systems, but a more fundamental limitation remains: syntax alone does not yield semantics. The divide between formal, surface-level symbol processing (syntax) and grounded truth and meaning (semantics) is absolute and unbridgeable.

This division is not new. In 1929, René Magritte painted The Treachery of Images: a picture of a pipe captioned (in translation) “This is not a pipe.” The representation is not the thing itself.

Formal logic systems were designed to mechanize reasoning and remove every element of human subjectivity. This allowed machines to process logical proofs, but it turned logic into a formal game, with the rules of the game (formal syntax) separated from the truth or meaning behind the symbols (semantics).

In the 1930s, Kurt Gödel showed that semantics outruns syntax. His famous incompleteness theorem proved that in any consistent formal system capable of expressing arithmetic, there are true mathematical statements that cannot be proven within the system.

Gödel accomplished this by constructing a clever self-referential statement, roughly “This statement is unprovable,” and encoding it as an arithmetic formula within the system.

If the statement were provable, it would be false, and the system would be proving a falsehood. So it must be unprovable, and therefore true: the system contains true statements it cannot prove.

Therefore, the world of truth is strictly larger than the world of symbols. Pure syntax cannot fully capture semantics.

AI systems are formal symbol processors trapped behind that syntactic wall. Philosophers like John Searle, through his Chinese Room argument, and computational linguists like Emily Bender have argued that symbol manipulation alone does not produce understanding, no matter how fluent it becomes.

A large language model rearranges tokens fluently. But fluency with symbols is not comprehension. A formal game, however well it imitates reality, is still a game.

About models and minds

Generative AI can simulate many human abilities. It can approximate language, fake reasoning, and imitate creativity.

But it does not, and cannot, fully replicate human beings.

The map is not the territory. The symbol is not the thing. And the model is not the mind.


