It’s not as if I joined the company on January 1st and said, “Let’s put out a press release” — but that’s how it can look.
It certainly can, Satya Nadella, it certainly can. That was the Microsoft CEO’s candid quip about ChatGPT during the company’s Build developer conference keynote, and it should leave no one in doubt that his is now an AI company.
ChatGPT may look like an “overnight” phenomenon, hence Nadella’s irony about the January press release, but he argued it was a much longer play, with some clear precedents.
To put this in a little perspective: last summer, while playing with DV3 (GPT-4 was then called Davinci 3), I was reading The Dream Machine by Mitchell Waldrop. And now I see what it all means.
The Dream Machine probably best conveys what we’ve really been doing over the last 70 years. It starts with what Vannevar Bush wrote in his seminal paper, As We May Think, which had it all, notions like associative memory. Then there was J.C.R. Licklider, the first person to conceptualize human-computer symbiosis; the Mother of All Demos in ’68; the Xerox Alto; and, of course, the first PDC I attended, back in ’91.
’93 had the Mosaic moment, then the iPhone and the cloud. It all becomes one continuous journey. Another thing I’ve always loved is [Apple co-founder Steve] Jobs’ line that the computer is a bicycle for the mind. It’s a beautiful metaphor, and I think it captures the essence of what computing is all about. And last November it got an upgrade: with the release of ChatGPT, we went from the bicycle to the steam engine. It was the Mosaic moment for this generation of AI platforms.
full speed ahead
According to Microsoft CTO Kevin Scott, AI is going full steam ahead now that it has moved off the sidelines of history.
There’s an incredible amount of attention now on what’s going on with these AI models: the rapid progress of what we call foundation models, and especially the rapid innovation being driven by the partnership between OpenAI and Microsoft. We are really setting the pace of innovation in the AI space right now.
It’s amazing to us, too, to see so much of the zeitgeist captured by things like ChatGPT and the applications people are building on top of these massive underlying models. The reason the OpenAI and Microsoft partnership is so successful is that together they have an end-to-end platform for building AI applications.
We build the world’s most powerful supercomputers, and we have the world’s most capable foundation models, whether they’re built in-house and hosted via APIs or open source and running comfortably on Azure. We also have the world’s best AI developer infrastructure. So whether you’re using these super-powerful computers to train a model from scratch, or building the kind of application we’ll talk about at Build this year, it all sits on top of that end-to-end infrastructure and platform.
story so far
As for how we got here, the stage was turned over to OpenAI President and Co-Founder Greg Brockman to talk about the experience of building ChatGPT and GPT-4.
ChatGPT has been a very interesting process from both an infrastructure perspective and an ML [machine learning] perspective. In fact, we’d been working on the idea of a chat system for years. We even demoed an early version called WebGPT at Build, and it was a fun demo. We had hundreds of contractors whom we literally had to pay to use the system, and their reaction was, “Uh, this is nice, it might help with my coding work.”
But the moment that really clicked for me was with GPT-4. The traditional process, with GPT-3, was that we deployed the base model: just pre-trained, not yet tuned in any direction, straight into the API. With 3.5, we got to the point where the model actually followed instructions; contractors were given, “Here are the instructions, and here is how this must be completed.”
That same training was done on GPT-4, and it yielded some interesting conclusions, Brockman said.
As a little experiment, I thought, “What happens if I send a second instruction in a row, after the model has already generated something?” The model came back with a perfect response that incorporated everything up to that point, and I realized it was fully functional. It changed the whole framing: if you really want the model to follow instructions, and you keep giving it new instructions, don’t you really just want to have a conversation with it?
And for me, that was the moment where I thought, “OK, the groundwork was already in place in the previous model, and this new model, which wasn’t built for chat, just makes it work as chat.” This was a real “aha!” moment. From there, we were just like, “We have to put this out. It works.”
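Brockman’s insight, that issuing a second instruction after a completion is already a conversation, amounts to carrying the full transcript forward on every turn. The sketch below is purely illustrative, not OpenAI’s implementation; the `Message` type and `render_prompt` helper are hypothetical names:

```python
# Illustrative sketch of the "chat is repeated instruction-following"
# idea. A completion-style model with no chat training can still be
# driven as a chatbot by flattening the transcript into one prompt.
from dataclasses import dataclass

@dataclass
class Message:
    role: str      # "user" or "assistant"
    content: str

def render_prompt(history: list[Message]) -> str:
    """Flatten a chat transcript into a single text prompt and cue the
    model to produce the next assistant turn."""
    lines = [f"{m.role}: {m.content}" for m in history]
    lines.append("assistant:")   # the model completes from here
    return "\n".join(lines)

# First instruction and the model's (stubbed) completion:
history = [
    Message("user", "Summarize this bug report."),
    Message("assistant", "The crash occurs when the cache is empty."),
]
# Brockman's experiment: a *second* instruction in a row.
history.append(Message("user", "Now suggest a fix."))

prompt = render_prompt(history)
```

Because each new instruction sees everything generated so far, the model’s reply naturally incorporates the whole exchange, which is exactly the behavior Brockman describes observing.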
As for GPT-4, according to Brockman, it was just a labor of love.
In fact, as a company, we attempted to exceed the performance of that model multiple times after GPT-3. It’s not easy. What we ended up doing was going back to the drawing board and rebuilding our entire infrastructure, and a lot of our approach was to pinpoint every detail. I’m sure there are still bugs, and more details will emerge, but an analogy from one of the project leads is that this is like building a rocket: you want all the engineering tolerances to be incredibly small.
He concluded:
What’s interesting to me is that we’re almost on a tick-tock cycle: come up with an innovation, then actually drive it forward. With GPT-4, we’re in the early stages of pushing it in earnest. We announced the vision feature, but it’s not yet in production. And I think it will change how these systems work, how they feel, and the applications that can be built on top of them.
If you look back at the last few years: two years ago we had a 70% price reduction, and last year we basically had a 90% cost reduction, which is a 10x cost reduction. That’s crazy, right? I think we will repeat the same thing with the new model. Right now, GPT-4 is expensive and not widely available, but that’s one thing I think will change.
my view
Not since Bill Gates’ Pauline conversion to the Internet some 25 years ago has Microsoft had a pivotal turnaround moment like the one generative AI is driving now. Back then, the result was the famous overnight email to employees asking them to refocus all efforts on the Internet in every part of the business. This week, almost every announcement coming out of Build was about AI. Perhaps the most notable was bringing live Bing search results into ChatGPT: whereas answers were previously limited to information up to 2021, users will now get more up-to-date answers drawn from across the web. For better or worse, almost everything had an AI thread running through it. The future is ahead.
