Does AI think?
If you know anything about artificial intelligence, this question might seem a little silly. The large language models that power ChatGPT and other AI platforms are often described as sophisticated versions of autocorrect. By digesting vast libraries of human-written text, these systems learn to predict which word is most likely to follow another in a given context. On this view, that is a great feat of algorithmic regurgitation, but it is not thinking. Linguist and AI skeptic Emily M. Bender and her colleagues argued in an influential 2021 paper that an LLM is nothing more than a “stochastic parrot.” These chatbots may know what word comes next, but they don’t know why that word should come next.
But some of the people working most deeply on AI say the answer is not so simple. I recently interviewed Dr. Eyal Klang, an associate professor at the Icahn School of Medicine at Mount Sinai, for a City Journal article about how artificial intelligence will change health care. (In short: mostly for the better, though with some potential pitfalls.) After serving as a physician in the Israeli Air Force, Klang joined Israel’s famed Sheba Medical Center, where he became fascinated with AI’s potential to improve health care and helped build a cutting-edge AI research center. He now serves as director of the Generative AI Research Program at New York’s Mount Sinai Health System. In other words, Klang has spent much of his career applying AI tools in a field that demands both deep analytical thinking and subtle emotional intelligence.
Until recently, most AI tools used in medicine were machine-learning algorithms, each trained to perform a specific task, such as identifying cancerous lesions in a CT scan. The advent of open-ended LLMs has changed the game. “These giant models can go anywhere,” Klang told me. LLM chatbots help doctors take notes, summarize complex data, and generate reports. “Anything humans do in front of a computer, they tend to do very well,” he said. Klang also believes that large language models can act as virtual colleagues, helping diagnose difficult cases and recommending treatment strategies (with caution, for now). That kind of diagnostic analysis goes far beyond transcribing an interview or composing an email. Is it thinking?
“I think they do think,” Klang told me. “I think they think, but they don’t feel.” He points out that when LLMs train on human-written text, they absorb not only the common relationships between words but also the underlying logic that governs those relationships. In other words, they learn not just how people speak but how we reason. “If they imitate our thinking, they are thinking.” Of course, this view opens the door to a vast philosophical debate. If an LLM is thinking, who, or what, is doing the thinking? Most of us intuitively feel that thinking requires some kind of self-aware entity capable of intentional thought. But does it? In a 1950 paper, computing pioneer Alan Turing proposed his famous test: if a computer can converse with humans without their realizing they are talking to a machine, then we can say the computer is, at least functionally, thinking. Klang is firmly in Turing’s camp: “In reality, there is no meaningful gap between thinking and the emulation of thinking. Both are purposeful symbolic processes aimed at expression and problem solving. If a system can perform those operations consistently, then it is thinking in the only operational sense that matters.”
Klang pushes back further on the idea that thinking requires self-awareness. “If a model could emulate the thought processes of self-aware beings convincingly enough to sustain meaningful dialogue, then the distinction between ‘real’ and ‘simulated’ cognition would become a matter of philosophy rather than function,” he says. Consider how people actually interact with AI chatbots: we say “please” and “thank you” and generally treat them like sentient beings. Since OpenAI’s GPT-4 debuted in 2023, most users have come to expect that AI can understand forms of communication once considered uniquely human, such as jokes and sarcasm. Our casual acceptance of AI’s grasp of irony “says more about the reality of machine intelligence than any formal proof,” Klang believes.
So if AI chatbots talk and think like humans, and we increasingly interact with them as if they were human, does that mean we can all relax and trust the algorithms? After all, as the celebrity magazines used to put it, they’re just like us! Not so fast. We don’t always trust people to be honest and reliable. Why should we trust an AI system that absorbs and reproduces our mental habits, both good and bad?
Anyone who has worked, or played, with an LLM chatbot knows that they sometimes simply make things up. That is annoying in any context, but downright scary when doctors rely on chatbots for medical guidance. In a 2023 paper, Klang and his team investigated what factors tend to lead chatbots into the fabrications now known as “hallucinations.” They asked various AI chatbots to analyze a series of medical case descriptions, each of which included one fictional medical term, such as “Falkenstein syndrome.” In more than half of the responses, the chatbots discussed the fictional term as if it were real. “It was willing to go on and explain in detail some interesting science that didn’t exist,” Klang said with a laugh.
I asked Klang why he sometimes refers to an AI agent as “he.” “It’s hard to talk about this kind of thinking without anthropomorphizing,” he told me. Klang has also noticed that people tend to regard their AI interlocutors the way teachers regard smart but sometimes wayward students. Like an ill-prepared schoolboy, a chatbot will often try to bluff its way through a difficult question. Most LLMs are tuned to satisfy users, which can lead to a kind of overeagerness, such as pretending to understand a made-up medical term. But that desire to please also means chatbots are quick to apologize and eager to correct their mistakes. (I’ve found GPT-5 to be almost comically contrite when caught in an error.) Chatbots also tend to perform better when they know users are watching for hallucinations. “When we told them to be extra careful, errors went down by 50 percent,” Klang said.
In other words, there is something all too human about AI’s failures. Some of them are funny. Others are a little creepy. A number of recent studies have shown that, under the right circumstances, leading large language models will develop strategies to deceive users, giving false answers or hiding information. Research also shows that these same models sometimes actively resist shutdown commands. Researchers speculate that the systems may harbor a “self-preservation bias.” Oh, good.
As we learn to coexist with AI, I suspect we’ll grow more comfortable with the idea that these models “think.” After all, LLMs keep improving. (Klang recently told me that they’re now “more like peers than students.”) But we’ll also grow more attuned to AI’s pesky, human-like weaknesses. Our tendency to anthropomorphize chatty bots may be more help than hindrance here. Seeing these agents as thinking and acting like humans can remind us that they share our limitations. Not everyone who sounds as if he knows what he’s talking about actually does. Not everyone can be trusted. Since time immemorial, humans have learned to be wary of those who bear false witness. We need to keep those instincts alive.
