Nick Frosst, a co-founder of Cohere who previously worked on AI at Google, agrees with Altman that scaling up will not work indefinitely. He, too, believes that progress on Transformers, the type of machine learning model at the heart of GPT-4 and its rivals, lies beyond scaling. “There are many ways to make Transformers better and more useful, many of which don’t require adding parameters to the model,” he says. Frosst says that new model designs, or architectures, and further tuning based on human feedback are promising directions that many researchers are already exploring.
Each version of OpenAI’s influential family of language algorithms consists of an artificial neural network, software loosely inspired by the way neurons work together, which is trained to predict the words that should follow a given string of text.
The first of these language models, GPT-2, was announced in 2019. In its largest form, it had 1.5 billion parameters, a measure of the number of adjustable connections between its crude artificial neurons.
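To make the idea of a parameter count concrete, here is a minimal sketch that tallies the adjustable connections in a small, fully connected network. The function name `count_parameters` and the layer sizes are hypothetical, chosen only for illustration; Transformer models such as GPT-2 are organized differently, so this is not how their totals are actually computed.

```python
def count_parameters(layer_sizes, include_biases=True):
    """Count tunable values in a plain fully connected network.

    layer_sizes lists the number of neurons per layer, input first.
    This is an illustrative assumption, not GPT-2's real layout.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out      # one weight per connection between layers
        if include_biases:
            total += n_out         # one bias per neuron in the receiving layer
    return total


# A toy network: 4 inputs, 8 hidden neurons, 2 outputs.
tiny_network = [4, 8, 2]
print(count_parameters(tiny_network))
```

Even this toy network has dozens of adjustable values; widening and stacking layers is what pushes the count into the billions.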
At 1.5 billion parameters, GPT-2 was huge compared with previous systems, thanks in part to OpenAI researchers finding that models became more coherent when scaled up. The company’s successor to GPT-2, GPT-3, announced in 2020, was even bigger, with a whopping 175 billion parameters. That system’s broad ability to generate poems, emails, and other text helped convince other companies and research institutions to push their own AI models to similar or even larger sizes.
After ChatGPT debuted in November, meme makers and tech commentators speculated that GPT-4, when it arrived, would be a model of dizzying size and complexity. But when OpenAI finally unveiled its new artificial intelligence model, the company did not disclose how big it was. At an MIT event, Altman was asked whether training GPT-4 cost $100 million. “That’s all,” he replied.
OpenAI keeps the size and inner workings of GPT-4 secret, but some of the model’s intelligence may already come from looking beyond scale alone. One possibility is that it used a method called reinforcement learning with human feedback, which was used to enhance ChatGPT. This involves having humans judge the quality of the model’s answers and steering the model toward answers that are more likely to be judged high quality.
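The feedback loop described above can be sketched as a toy example. This is a deliberately simplified illustration, not OpenAI’s procedure: the candidate answers, the human ratings, and the function names are all hypothetical, and real systems typically train a separate reward model and use more elaborate optimizers. Here a "model" chooses among three fixed answers, and each step nudges it toward the answers that humans rated more highly.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def feedback_step(logits, ratings, learning_rate=0.5):
    """One gradient-ascent step on the expected human rating.

    For a softmax policy, d E[rating] / d logit_i = p_i * (rating_i - E[rating]).
    """
    probs = softmax(logits)
    expected = sum(p * r for p, r in zip(probs, ratings))
    return [
        logit + learning_rate * p * (r - expected)
        for logit, p, r in zip(logits, probs, ratings)
    ]


# Hypothetical candidate answers and made-up human quality scores.
answers = ["helpful answer", "vague answer", "rude answer"]
ratings = [1.0, 0.3, 0.0]

logits = [0.0, 0.0, 0.0]  # start with no preference among answers
for _ in range(200):
    logits = feedback_step(logits, ratings)

probs = softmax(logits)
for answer, p in zip(answers, probs):
    print(f"{answer}: {p:.2f}")
```

After repeated steps, most of the probability ends up on the answer the raters preferred, which is the basic effect the method aims for.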
GPT-4’s astonishing capabilities have taken some experts by surprise and sparked debate about the potential of AI not only to transform the economy but also to spread disinformation and eliminate jobs. A group of AI experts, scientists, and tech entrepreneurs, including Elon Musk, recently wrote an open letter calling for a six-month moratorium on the development of anything more powerful than GPT-4.
At MIT last week, Altman confirmed that his company is not currently developing GPT-5. “An earlier version of the letter claimed that OpenAI was currently training GPT-5,” he said. “We’re not, and we don’t plan to be for a while.”
