At Data Center World 2023, Omdia’s Vlad Galabov and AMD’s Kumaran Siva discussed processors and AI, specifically how they affect performance, cost, power consumption, and sustainability.
This is Part 2 of our conversation at DCW. Click here for Part 1.
Transcription follows. Minor edits have been made for clarity.
Transcription:
Vlad Galabov: We’re talking about one of the very hot topics: modern process nodes, and how AI performance correlates with process nodes and compute capability. Can you talk about what people find to be the most useful feature of a processor? In our industry, we talk so much about matrix multiplication that we tend to stick to benchmarks. But the reality is that people don’t run benchmarks on servers, they run real workloads. So which aspects of AI application performance have you found to be affected by which aspects of the processor?
Kumaran Siva: If you look at AMD processors today, hyperscalers are running AI on them as we speak. Some big companies are talking publicly about it. For example, Tencent has discussed how WeChat leverages our silicon for its recommendation engine, and many other companies are using AMD silicon for the same purpose.
Where we excel is in offering the best general-purpose processors. If you look at inference, it’s a series of pipeline steps: you have to get the data, preprocess it, perform the inference, and then do something with the result. The entire pipeline benefits from scalable, highly efficient cores, and that’s an advantage many of our customers take advantage of.
If you focus only on a matrix-multiplication benchmark, you can probably find a more specialized solution. Over the full pipeline, though, that advantage dilutes to the point where it doesn’t matter. That’s one of the keys to running AI on general-purpose CPUs, and on AMD architectures in particular.
Vlad: Interestingly, I visited Tencent last week, and one of the things they were looking at was how to innovate for sustainability: how to reduce the power consumption of all their workloads. I think what you’re pointing out is part of that. It comes down to the materials in the rack, the servers you use, and the CPUs. Most interestingly, they are also looking at ways to reduce the power consumption of the physical infrastructure within their data centers. Their early experiments with liquid cooling have yielded some very nice efficiency gains while running some of the hottest workloads in the world.
Sitting down to talk about computing in 2023, you can’t help but address the elephant in the room. One of the things that has taken the world by storm since this time last year, when we held our first Omdia Analyst Summit, is generative AI. ChatGPT has gone mainstream, and it was adopted very quickly. One of the things we’ve done a lot of as analysts is investigate how ChatGPT is trained. It’s pretty well documented: it uses high-performance computing clusters and an architecture most of us are familiar with.
But from what we’ve seen as analysts, the big challenge is how to turn a solution like ChatGPT into a commercially viable product: how to perform inference cost-effectively and fast, in a sustainable, energy-efficient way. Can you talk about AI inference as a workload? What do you think the requirements are? How do we best perform AI inference? Because I think it’s not just about compute; cost, power efficiency, and sustainability also matter.
Kumaran: Absolutely. A few thoughts. AMD is a broad semiconductor provider, so naturally we also have GPUs, and we’re involved in the generative AI trend from several angles. One thing to think about, again from a CPU perspective, is the end-to-end picture. You don’t just return a response; you also have to ingest and preprocess the data and do something with the output. All of that adds up, and so does the inference itself. CPUs therefore offer a unique value proposition.
On the inference itself, we’ve partnered with companies working on sparsity. For example, Neural Magic showed great results on Genoa CPUs: up to 1,000x better than off-the-shelf, non-optimized ONNX Runtime code. That opens the door to thinking about how to do generative AI inference on the CPU as well.
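Neural Magic’s actual kernels aren’t described in the conversation, but the basic intuition behind sparsity on CPUs can be sketched in a few lines of plain Python: if a model’s weights are heavily pruned, storing only the nonzero weights lets each layer skip most of its multiply-accumulates. This is a hypothetical, unoptimized illustration of the idea, not the partners’ implementation.

```python
import random

def dense_matvec(w, x):
    # Multiply every weight, including the zeros.
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def to_sparse(w):
    # CSR-like form: keep only (column, value) pairs for nonzero weights.
    return [[(j, v) for j, v in enumerate(row) if v != 0.0] for row in w]

def sparse_matvec(sw, x):
    # Touch only the surviving weights, skipping the pruned ones.
    return [sum(v * x[j] for j, v in row) for row in sw]

random.seed(0)
n = 64
# Simulate a layer whose weights are roughly 90% pruned.
w = [[random.gauss(0, 1) if random.random() < 0.1 else 0.0
      for _ in range(n)] for _ in range(n)]
x = [random.gauss(0, 1) for _ in range(n)]

sw = to_sparse(w)
dense_out = dense_matvec(w, x)
sparse_out = sparse_matvec(sw, x)

macs_dense = n * n
macs_sparse = sum(len(row) for row in sw)
print(f"dense multiply-accumulates:  {macs_dense}")
print(f"sparse multiply-accumulates: {macs_sparse}")
```

Both paths produce the same outputs, but the sparse path performs roughly a tenth of the arithmetic; real sparse inference engines add vectorized kernels and cache-friendly layouts on top of this idea.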
In a broader perspective, AI is becoming part of the programmer’s toolbox. ChatGPT grabs the headlines, but in practice small and medium models are starting to show up in everyday programming as well, integrated as part of your code flow and incorporated naturally into your application.
I think this is one of the ways companies are going to adopt AI. An ISV application itself may embed small and medium models used simply to help analyze data and visualize it better. Even Microsoft Office has a little sidebar with recommendations for what a slide could look like, and that’s probably a little AI model. That sort of thing is starting to seep into user interface design and data analysis, and those uses will start to multiply.
Vlad: Oh, absolutely. One of the things we’ve discussed with our colleagues who specialize in enterprise AI is that the boring AI nobody talks about actually has enormous business value. They have several case studies of how much money retailers have saved by implementing a very simple AI model. So yes, the most amazing things get the headlines, but there’s a lot of untapped business value in very simple AI models that could easily run on a mobile phone. It certainly helps to have a very efficient processor.
Kumaran: Absolutely.
Vlad: We’re very excited to hear more about AMD’s commitment to sustainability and how its CPU innovations can transform the fabric of the data center. We look forward to welcoming you to the Omdia Analyst Summit. Thank you for your time.
Kumaran: Thank you very much, Vlad. It was a pleasure.
