Former OpenAI scientist Andrej Karpathy is “bearish on reinforcement learning” in the long run

File Photo: Andrej Karpathy, a well-known AI researcher and former OpenAI scientist, said in a post on X that he is “bearish on reinforcement learning” in the long run. Photo credit: Reuters

Famed AI researcher and former OpenAI scientist Andrej Karpathy said in a post on X that he is “bearish on reinforcement learning” over the long term because it is proving inefficient and difficult to design. Karpathy, one of OpenAI's founding members who also worked on the GPT-4 model, said he believes new learning methods that more closely resemble human thinking will ultimately replace reinforcement learning.

“Personally and in the long run, I'm bullish about the interaction between the environment and the agent, but specifically, I'm bearish on reinforcement learning,” he said. He questioned whether humans use reinforcement learning for most intellectual tasks, with the possible exception of “some motor tasks.”

“Humans use a variety of learning paradigms that are very powerful and efficient but have not yet been properly invented and scaled, although early sketches and ideas exist,” he added.

As progress in current mainstream large language models has slowed, reinforcement learning, a machine learning training technique used to build AI models, has seen a revival.

Karpathy said that earlier AI training techniques, such as reading text and imitating examples, will continue to exist, but that in the future models will learn by existing in environments and interacting with them.

Published – August 29, 2025 02:09 PM IST
