Will the CPU become the new Holy Grail of artificial intelligence?



Over the past few years, the artificial intelligence revolution has been told through a very simple investment narrative. At the center of this story stands a single protagonist: the GPU. Graphics processors have become the symbol of a new technological era, and their availability has directly determined which companies are able to train the most advanced language models and which are left behind in the race for the future of AI.

As a result, the market quickly learned to think about artificial intelligence through a single metric: computational power. More GPUs meant larger models, larger models meant better products, and better products meant competitive advantage.

Over time, however, this narrative grew more complex. It turned out that raw computational power is not enough if the system cannot deliver data at the required speed. The bottleneck was no longer only GPUs but increasingly memory as well: both the memory closest to the processor, in the form of HBM, and traditional server DRAM, along with the entire data storage and transfer infrastructure.

This was the first moment when investors began to realize that the AI revolution is not the story of a single component but of an entire technological chain, from silicon through memory to networking and cooling systems.

And now, just when the map of this revolution seemed relatively complete, another shift is emerging, far less obvious but potentially just as important as the previous ones.

An increasingly important role is being played by a layer that for years was treated as “obvious infrastructure”: the CPU. In a world where AI is no longer a single query to a model but instead resembles a complex system of autonomous agents performing multi-step tasks, it is not only the scale of computation that is changing but, above all, its nature.

At this point, a question arises that not long ago seemed secondary. Could the CPU, previously acting as a coordinator and quiet partner to the GPU, actually be becoming one of the key components of the entire AI architecture?

And if so, does this mean we are entering a third wave of the AI revolution after GPUs and memory, in which the key factor will no longer be raw computational power, but rather how well we can connect and orchestrate the entire system as a whole?

THE GPU ERA

At the beginning of the AI revolution, there were not many doubts about where its technological foundation would emerge. With the breakthrough in deep learning and the appearance of increasingly large language models, it quickly became clear that the key limitation was no longer the algorithm itself, but the scale of computation required to train it.

At that point, graphics processors came to the forefront. Their architecture, originally designed for rendering graphics and handling parallel image operations, turned out to be perfectly suited to the type of computation required by neural networks. Instead of a single very fast core, GPUs offer thousands of simpler processing units capable of performing the same operations in parallel across massive datasets.

This is what made GPUs the natural engine of the AI revolution. Training language models, especially those based on transformer architectures, largely comes down to matrix operations—tasks that can be easily parallelized. In practice, this meant that the more GPU power could be concentrated in a single system, the larger the model that could be trained.

As a result, a new standard for compute infrastructure quickly emerged. Data centers began to resemble clusters of specialized accelerators, where CPUs played a supporting role, mainly responsible for data preparation, process management, and communication between system components. All the “heavy mathematics” was moved to GPUs.

This architecture led to a strong concentration of value in a single segment of the market. As demand for computing power grew, GPU manufacturers captured the largest share of the economic value of the AI revolution. Access to GPUs became not only a technological advantage but also a strategic constraint determining the pace of development of entire companies and research labs.

In this setup, the market began to think about AI in a very linear way. More GPUs meant more compute, more compute meant larger models, and larger models meant better products. The logic of this revolution seemed relatively simple and well understood.

Only over time did the first signals emerge that this picture was incomplete.

MEMORY AS THE SECOND WAVE

As AI models grew from millions to billions and then hundreds of billions of parameters, a problem emerged that was not initially as obvious as the lack of compute power. It turned out that simply increasing the number of GPUs does not solve all system limitations if data cannot flow fast enough through the entire compute architecture.

At this point, memory began to move into focus: both the memory directly attached to GPUs, in the form of HBM, and traditional server DRAM, along with the entire data storage and transfer layer in data centers. Memory became the factor determining the speed at which increasingly large models could be trained and run.

In practice, this meant that even the most advanced GPUs were unable to fully utilize their potential if they were not properly “fed” with data. The bottleneck was no longer compute itself, but the system’s ability to maintain a continuous flow of information between memory, networking, and accelerators.
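The idea of a GPU that is not properly "fed" can be illustrated with the well-known roofline bound, in which attainable throughput is the smaller of peak compute and memory bandwidth multiplied by the work done per byte moved. The sketch below is only an illustration: the function name and all figures are assumptions chosen for round arithmetic, not measurements of any real accelerator.

```python
def attainable_flops(peak_flops: float, bandwidth_bytes: float,
                     flops_per_byte: float) -> float:
    """Roofline bound: throughput is capped by compute or by memory traffic."""
    return min(peak_flops, bandwidth_bytes * flops_per_byte)


# Illustrative numbers only (assumptions, not data about a real device).
PEAK = 100e12       # 100 TFLOP/s of peak compute
BANDWIDTH = 2e12    # 2 TB/s of memory bandwidth

# A workload doing little work per byte is limited by memory, not compute.
memory_bound = attainable_flops(PEAK, BANDWIDTH, flops_per_byte=10)
# A workload doing a lot of work per byte can reach the compute peak.
compute_bound = attainable_flops(PEAK, BANDWIDTH, flops_per_byte=100)

print(f"memory-bound workload:  {memory_bound / PEAK:.0%} of peak")
print(f"compute-bound workload: {compute_bound / PEAK:.0%} of peak")
```

Under these assumed figures, the first workload leaves most of the accelerator idle no matter how fast its cores are, which is the sense in which memory, rather than compute, becomes the bottleneck.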

This was the moment when the AI revolution began to shift from a purely computational problem to a systems problem. Instead of a single dominant component, we started observing an increasingly complex dependency between different layers of infrastructure.

Memory, previously treated as a supporting element, began to play a strategic role. High-bandwidth solutions such as HBM became one of the key enablers of modern model scaling, and memory manufacturers began to occupy a more important position in the value chain of the AI revolution.

Importantly, this stage did not replace GPUs, but rather revealed their natural limitations. As models scaled, it became clear that compute alone has no value if it is not supported by sufficient data throughput. As a result, the market gradually began to recognize that AI is not a single race for the most powerful processor, but a complex system in which every infrastructure component can become a potential bottleneck.

This is when a more complete understanding of AI as an ecosystem began to form, where alongside GPUs, memory, networking, and data infrastructure all play essential roles.

CPU AND THE RISE OF AGENTIC AI

For a long time, the role of CPUs in the AI revolution seemed relatively stable and well defined. They were responsible for system management, data preparation, and coordinating the work of GPU accelerators, which performed the heavy computational tasks. In this setup, the CPU acted as a quiet infrastructure operator, invisible from the end-user perspective and largely unchanged in its function.

This picture is now changing with the emergence of a new class of AI applications, increasingly referred to as agentic AI. Unlike traditional language models that respond to single prompts, agent-based systems are designed to perform complex multi-step tasks that require not only generating responses but also taking actions within a digital environment.

In practice, this means that instead of a single query and a single response, we are dealing with an entire chain of operations. An agent may start by analyzing a problem, then break it into smaller steps, execute a series of queries to external systems, databases, or APIs, process the obtained information, and only then produce a final result. Each of these steps requires separate system operations, communication with different data sources, and continuous state management of the entire process.
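The chain of operations described above can be sketched as a small orchestration loop. This is a minimal illustration, not any particular framework's API: `fake_model`, the `TOOLS` table, and `AgentState` are hypothetical names, and the model is a stub standing in for a GPU-hosted language model. Everything outside the two model calls (planning the loop, calling tools, tracking state) is the kind of work that runs on the CPU.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """State the orchestrator keeps between steps (CPU-side work)."""
    task: str
    notes: list = field(default_factory=list)
    model_calls: int = 0
    tool_calls: int = 0


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a GPU-hosted language model."""
    if prompt.startswith("PLAN:"):
        return "lookup_price;lookup_stock"
    return "summary of " + prompt[len("ANSWER:"):]


# Hypothetical tools; in a real system these would be API or database calls.
TOOLS = {
    "lookup_price": lambda: "price=10",
    "lookup_stock": lambda: "stock=3",
}


def run_agent(task: str) -> tuple[str, AgentState]:
    state = AgentState(task=task)

    # 1. Ask the model to break the task into smaller steps.
    plan = fake_model("PLAN:" + task)
    state.model_calls += 1

    # 2. Execute each step against an external tool and record the result.
    for step in plan.split(";"):
        state.notes.append(TOOLS[step]())
        state.tool_calls += 1

    # 3. Ask the model to produce the final result from the collected notes.
    answer = fake_model("ANSWER:" + ",".join(state.notes))
    state.model_calls += 1
    return answer, state


answer, state = run_agent("check item 42")
print(answer)
print("model calls:", state.model_calls, "tool calls:", state.tool_calls)
```

Even in this toy version, a single user request turns into several model and tool invocations plus state bookkeeping between them.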

In this new model, the computational burden begins to shift. The language model running on the GPU becomes only one part of a larger system, responsible for language generation and interpretation. The rest—control logic, task management, communication between systems, and handling external tools—increasingly loads the CPU.

This is where a fundamental shift in perspective appears. Previously, the CPU was treated as a supporting layer whose job was to “not get in the way” of GPU computation. In the world of agentic AI, however, the CPU begins to act as an active coordinator that not only manages data flow but also participates in the system’s decision-making process.

Importantly, this is not a cosmetic change but a structural one. Each AI agent performs not one but many computational and operational steps, leading to a sharp increase in operations executed outside the GPU. As a result, the importance of CPU infrastructure grows, as it must handle a massive number of queries, processes, and real-time interactions in parallel.

At this point, the first real architectural shift in AI systems becomes visible. Instead of a model centered around a single type of computation, we are moving toward a multi-layer system in which different components are responsible for different operational roles. GPUs handle matrix computations, memory handles storage and data flow, while CPUs increasingly become the layer responsible for orchestration of the entire process.

CHANGING SYSTEM ARCHITECTURE

With the growing role of agentic AI, it is not only the way models are used that is changing but the entire architecture of AI systems. The traditional division, where GPUs handled computation and CPUs played a supporting role, is becoming increasingly inadequate for the actual operation of modern AI applications.

Instead of a single computational process, we are increasingly dealing with a system resembling a complex network of cooperating layers. The language model remains the “reasoning” core, but around it grows an extensive infrastructure responsible for memory, data flow, communication with tools, and real-time execution.

In such a setup, the CPU is no longer just a technical support layer for the GPU, but becomes the integrator of the entire system. It now carries an increasing share of responsibilities related to orchestration, state management, and handling complex agent-based processes.

As a result, AI ceases to be a single model and becomes an operating system for intelligent processes, in which different types of hardware play specialized but interdependent roles.

ECONOMICS OF CHANGE

The most important change taking place across AI infrastructure concerns not how systems are built but how demand for computational resources is distributed. For a long time, the dominant reference point was the CPU-to-GPU ratio, which in traditional AI clusters was heavily skewed toward accelerators.

With the rise of agentic AI, this balance is gradually changing. Instead of an architecture where CPUs play a marginal role and GPUs dominate the system, we are moving toward a more balanced model in which general-purpose processors take on a growing share of workloads related to orchestration, tool handling, and multi-step processing.

This shift has direct economic consequences. As more operations move outside GPUs, demand for CPU power in data centers increases, leading to a higher number of cores required per accelerator. As a result, AI infrastructure becomes more resource-intensive not only in terms of GPUs but also in general-purpose compute.
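The link between agent steps and cores per accelerator can be made concrete with back-of-the-envelope arithmetic. The sketch below is an assumption-laden illustration: the function, its parameters, and every input figure are hypothetical and chosen only to show the direction of the effect, not to estimate any real deployment.

```python
import math


def cpu_cores_needed(sessions: int, steps_per_session_per_s: float,
                     cpu_s_per_step: float,
                     target_utilization: float = 0.5) -> int:
    """Cores needed so CPU-side work fits within the target utilization."""
    busy_cores = sessions * steps_per_session_per_s * cpu_s_per_step
    return math.ceil(busy_cores / target_utilization)


# Hypothetical workloads, each served by one accelerator.
# Single-query chat: few CPU-side steps, each one cheap.
chat_cores = cpu_cores_needed(sessions=200, steps_per_session_per_s=0.5,
                              cpu_s_per_step=0.01)
# Agentic workload: many tool calls and state updates per session.
agent_cores = cpu_cores_needed(sessions=200, steps_per_session_per_s=5,
                               cpu_s_per_step=0.02)

print("chat cores per accelerator: ", chat_cores)
print("agent cores per accelerator:", agent_cores)
```

With the same number of sessions, more steps per session and more CPU time per step multiply together, so the core count per accelerator rises much faster than either input alone.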

From a system perspective, this leads to a structural shift in demand across the value chain. Capital expenditure that previously focused mainly on GPUs and high-bandwidth memory is increasingly extending into the CPU segment. This creates pressure on supply chains, increases utilization of manufacturing capacity, and gradually reshapes expectations for the server CPU market.

In this context, the CPU is no longer seen as a mature and stable segment, but rather as one of the key components of AI infrastructure whose importance grows with the complexity of agent-based systems.

CPU MARKET AND KEY PLAYERS

The changing role of CPUs in AI architecture is reshaping the competitive landscape of the semiconductor industry. For many years, the server CPU market was relatively stable and dominated by a single player, but with the arrival of the agentic AI era, it is once again becoming a field of intense technological competition.

At the center of this competition are three main forces: AMD, Intel, and Arm. Each represents a different business model, architecture, and approach to what a modern processor should be in the AI era.

AMD is the most direct beneficiary of changes in the x86 server segment. With its EPYC processors, the company is steadily increasing its market share while offering strong energy efficiency and competitive performance per core. In the context of rising CPU demand in agent-based systems, AMD also benefits from its ability to provide both CPUs and GPUs, building a more complete compute stack for data centers.

Intel, on the other hand, is in a transformation phase. After years of losing share in the server market, it is trying to regain its position through new Xeon generations and a strategy focused on advancing its manufacturing processes. However, Intel’s challenge is not only technological but also strategic—redefining its role in an AI ecosystem that has largely evolved outside its historical strengths.

The third pillar is Arm, which operates at a different level of the value chain. Rather than manufacturing chips, Arm provides the architecture used by hyperscalers to design their own processors. As a result, a growing share of CPU growth does not flow directly to traditional manufacturers but instead to cloud ecosystems that build customized silicon.

This leads to a structural shift. The CPU market is no longer a simple duopoly of Intel and AMD, but a multi-layered ecosystem where hyperscalers such as AWS, Google, and Microsoft design their own processors optimized for specific workloads.

In this setup, there is no single dominant winner. Instead, we observe a market where different business and architectural models coexist and compete for a growing share of compute demand in the AI era.

THE THIRD WAVE OF AI AND ITS IMPLICATIONS

Looking at the entire AI revolution from an infrastructure perspective, a clear pattern of evolution emerges, in which successive layers of the system gradually move from the background to the center of attention. First came GPU compute, enabling modern language models. Then memory moved to the forefront, without which scaling would not have been possible. Now, CPUs are increasingly becoming the next layer in this chain.

This shift is not driven by technological fashion, but by a fundamental evolution in how AI systems operate. The transition from single queries to language models toward agentic AI represents a shift from simple computations to complex multi-step decision processes. In such an environment, the importance of task management, communication with external systems, and parallel operation handling increases significantly.

These are precisely the functions that increasingly load CPUs, which are no longer a supporting layer but an integral component of AI system operation.

This leads to a significant revision of expectations regarding the size of the server CPU market. Projections pointing to growth beyond 120 billion dollars by 2030, and in more aggressive scenarios up to 200 billion dollars, suggest that the server CPU segment is no longer mature and stable but is entering a separate growth cycle driven by AI.

In this new structure, there is no single winner. AMD benefits from rising demand in the x86 segment and strengthens its position as a key AI infrastructure player. Intel attempts to leverage the renewed importance of CPUs to rebuild its position while facing technological and competitive challenges. Arm, meanwhile, captures an increasing share of cloud-based growth, where hyperscalers design custom silicon for specific workloads.

The key point is not to identify a single winner, but to understand that the CPU is becoming a third, parallel wave of the AI revolution alongside GPUs and memory: a wave that does not replace the previous ones but complements them, creating a more complete picture of AI infrastructure.

In this view, the AI revolution is no longer the story of a single technological breakthrough, but a multi-stage process of value redistribution across the semiconductor value chain. And the CPU, long treated as a secondary component, is beginning to occupy a position in this system that few expected not long ago.

 


