The AI boom is reshaping the cloud business

AI For Business



In the race to deploy enterprise-grade generative AI, three hyperscalers — AWS, Microsoft and Google — offer a glimpse of how large language models and the tools they enable are reshaping infrastructure.

Running generative AI at scale requires massive data storage and compute power in addition to the infrastructure, platform, data, and software services already supported by the cloud.

The cloud model has excess capacity built in, but the added volume puts stress on the current infrastructure that hyperscalers are already trying to mitigate.

Alphabet is building Google Cloud data centers and redistributing workloads to accommodate AI computing, CEO Sundar Pichai said last month on the company’s first quarter 2023 earnings call.

Microsoft is focused on optimizing its Azure infrastructure to achieve the same goals, CEO Satya Nadella said on its earnings call in March.

Resource constraints remain a top priority, however. “Accelerated computing is what is used to drive AI, and what we are very focused on is using these resources very efficiently,” he said.

AWS announced the next generation of its Inferentia2 and Trainium chips for SageMaker last week. According to Amazon, the new chips address the need for more powerful, efficient and cost-effective cloud computing hardware to handle the load of ML inference and model training.

Amazon CEO Andy Jassy touted advances in AI chip technology on his first quarter 2023 earnings call last month.

“The bottom layer here is that all large language models run on compute,” Jassy said. “And the key to that computing will be the chip that powers that computing.”

“To date,” Jassy said, “I think a lot of the chips out there, especially the GPUs optimized for this type of workload, are expensive and scarce. It’s hard to find enough capacity.”

Gartner vice president analyst Sid Nag said data processing needs to be fast and requires highly scalable computing power.

“If everyone started building new applications that used generative AI tools instead of traditional SQL, you can imagine what the strain would be on cloud infrastructure,” Nag said.

Compute drives cost

In addition to adding capacity, hyperscalers must prioritize efficiency to make the technology affordable.

Cost is already a concern. In a December tweet, OpenAI CEO Sam Altman described ChatGPT’s compute fees as “eye-popping.”

According to a recent National Bureau of Economic Research working paper, a single training run of a typical GPT-3 model can cost as much as $5 million.
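The working paper’s figure can be sanity-checked with a back-of-envelope estimate. The sketch below uses the common 6·N·D rule of thumb for training FLOPs; the parameter count, token count, GPU throughput, utilization rate, and hourly price are all illustrative assumptions, not figures from the article or the paper.

```python
# Back-of-envelope estimate of large-model training cost.
# All inputs are illustrative assumptions, not reported figures.

def training_cost_usd(params, tokens, peak_flops, utilization, price_per_gpu_hour):
    """Estimate training cost using the ~6*N*D FLOPs rule of thumb."""
    total_flops = 6 * params * tokens             # forward + backward pass
    effective_flops = peak_flops * utilization    # sustained throughput per GPU
    gpu_hours = total_flops / effective_flops / 3600
    return gpu_hours * price_per_gpu_hour

# GPT-3-scale run: 175B parameters, 300B tokens, A100-class GPU
# at an assumed ~30% utilization and an assumed $2.50/GPU-hour.
cost = training_cost_usd(
    params=175e9,
    tokens=300e9,
    peak_flops=312e12,
    utilization=0.30,
    price_per_gpu_hour=2.50,
)
print(f"~${cost / 1e6:.1f}M")
```

Under these assumptions the estimate lands in the single-digit millions of dollars, consistent with the order of magnitude the paper reports.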

According to a CNBC report, New Street Research estimates that the infrastructure price for adding ChatGPT functionality to Microsoft’s Bing search engine could reach $4 billion. Bing currently accounts for less than 3% of the global search market.

The move from raw central processing units to more specialized forms of computing powered by graphics processing units, Google’s tensor processing units for AI acceleration, and new generations of NVIDIA products will create new cost dynamics, according to Young, an analyst at Info-Tech Research Group.

“This combination will continue to drive the need to increase the space to accommodate this computing, or to appropriately adjust the utilization of existing space based on customer and internal hyperscaler demand,” Young said.

The challenge is also a business opportunity.

According to Young, the size of the platform required to support generative AI capabilities is not a viable in-house option for most companies.

“With most companies moving to the cloud to provide the specialized platforms and raw power they need, there is a new opportunity for hyperscalers to increase their market share,” Young said.

According to Nag, investing in chips, servers and other infrastructure to support advances in AI is nothing new.

Moore’s Law, which states that the number of transistors in a chip, and thus its computing power, doubles every two years, remains paramount for hyperscalers.
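Moore’s Law’s doubling cadence compounds quickly, which is why it still matters for hyperscaler capacity planning. A minimal illustration of the compounding (the starting count and timespan are arbitrary):

```python
def transistors_after(initial, years, doubling_period_years=2):
    """Project transistor count under Moore's Law: doubling every two years."""
    return initial * 2 ** (years / doubling_period_years)

# Over a decade, doubling every two years compounds to a 32x increase.
growth = transistors_after(1.0, 10)
print(growth)  # 32.0
```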

“Before the AI wave, Google and others were trying to reduce data center costs and use better silicon technology to make data centers more efficient and sustainable,” Nag said.

Training better models and identifying enterprise use cases are current industry focuses, allowing companies to fine-tune the technology to their specific needs.



