Meta bets big on AI with custom chips and supercomputers

Image credit: Bryce Durbin/TechCrunch

At this morning’s virtual event, Meta kicked off an effort to develop in-house infrastructure for AI workloads, including generative AI like the type that underpins its recently launched ad design and creation tools.

The move was an attempt to project strength from Meta, which has historically been slow to adopt AI-friendly hardware systems, hobbling its ability to keep pace with rivals such as Google and Microsoft.

“Building our own [hardware] capabilities gives us control over every layer of the stack, from data center design to training frameworks,” Alexis Bjorlin, Meta’s vice president of infrastructure, told TechCrunch. “This level of vertical integration is needed to push the boundaries of AI research at scale.”

Over the past decade or so, Meta has spent billions of dollars hiring top data scientists and building new kinds of AI, including the AI that now powers the detection engines, moderation filters, and ad recommenders in use across its apps and services. But the company has struggled to commercialize many of its more ambitious AI research innovations, especially in the area of generative AI.

Until 2022, Meta primarily ran its AI workloads using a combination of CPUs (which tend to be less efficient than GPUs for this kind of task) and custom chips designed to accelerate AI algorithms. Meta canceled a large-scale rollout of the custom chips planned for 2022 and instead ordered billions of dollars’ worth of Nvidia GPUs, a swap that required a major redesign of several of its data centers.

To turn things around, Meta planned to start developing a more ambitious in-house chip, scheduled for 2025, capable of both training and running AI models. And that was the main topic of today’s presentation.

Meta calls the new chip the Meta Training and Inference Accelerator (MTIA for short) and describes it as part of a “family” of chips for accelerating AI training and inference workloads. (“Inference” refers to running a trained model.) The MTIA is an ASIC, a type of chip that combines various circuits on a single board and can be programmed to carry out one or more tasks in parallel.

MTIA, the AI chip Meta custom-designed for AI workloads.

“To get better levels of efficiency and performance across our critical workloads, we needed a customized solution co-engineered with our model, software stack and system hardware,” continued Bjorlin. “This will improve the user experience across different services.”

Custom AI chips are becoming more and more popular among big tech companies. Google created a processor, the TPU (short for “tensor processing unit”), to train large-scale generative AI systems such as PaLM-2 and Imagen. Amazon offers its own chips to AWS customers for both training (Trainium) and inference (Inferentia). And Microsoft is reportedly working with AMD to develop its own AI chip called Athena.

Meta says it created the first generation of the MTIA, MTIA v1, in 2020, built on a 7-nanometer process. Its internal memory can be expanded beyond 128MB up to 128GB, and in benchmark tests designed by Meta (which should be taken with a grain of salt), Meta found that the MTIA handled “low complexity” and “medium complexity” AI models more efficiently than GPUs.

Meta said there is still work to be done in the memory and networking areas of the chip, which become bottlenecks as AI models grow in size, requiring workloads to be split across multiple chips. (Not coincidentally, Meta recently acquired an Oslo-based team building AI networking technology at UK chip unicorn Graphcore.) And for now, the MTIA’s focus is strictly on inference rather than training of workloads.

However, Meta claims that the MTIA, which it continues to improve, has made the company “significantly” more efficient in terms of performance per watt when running recommendation workloads, in turn (ostensibly) allowing Meta to run more enhanced and “state of the art” AI workloads.

AI supercomputer

Perhaps one day, Meta will offload most of its AI workloads to banks of MTIAs. But for now, the social network relies on the GPUs of the Research SuperCluster (RSC), a research-focused supercomputer.

First announced in January 2022, the RSC has completed its second phase of construction, assembled in partnership with Penguin Computing, Nvidia and Pure Storage. According to Meta, it currently includes a total of 2,000 Nvidia DGX A100 systems with 16,000 Nvidia A100 GPUs.

So why build a supercomputer in-house? Well, first, peer pressure. A few years ago, Microsoft made much of an AI supercomputer it built in partnership with OpenAI, and more recently announced a partnership with Nvidia to build a new AI supercomputer in the Azure cloud. Elsewhere, Google has touted its own AI-focused supercomputer, powered by 26,000 Nvidia H100 GPUs, giving it an edge over Meta’s.

Meta’s AI research supercomputer.

But Meta says the RSC isn’t just about keeping up with the Joneses: it gives its researchers the advantage of training models using real-world examples from Meta’s production systems. This differs from the company’s previous AI infrastructure, which leveraged only open source and publicly available datasets.

“The RSC AI supercomputer is being used to push the boundaries of AI research in several areas, including generative AI,” said a Meta spokesperson. “This is really about the productivity of AI research.”

The company claims that the RSC can reach nearly 5 exaflops of computing power at its peak, making it among the fastest in the world. (Lest that raise eyebrows, it’s worth pointing out that some pundits take exaflops performance metrics with a grain of salt, and that the RSC is far outpaced by many of the world’s fastest supercomputers.)

Meta says it used the RSC to train LLaMA, a tortured acronym for “Large Language Model Meta AI.” LLaMA is a large language model that the company shared with researchers as a “gated release” earlier this year (and which subsequently leaked to various internet communities). According to Meta, the largest LLaMA model was trained on 2,048 A100 GPUs, which took 21 days.
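Those figures imply roughly a million GPU-hours of compute; a quick back-of-the-envelope check (assuming, as the numbers above suggest, that all 2,048 GPUs ran continuously for the full 21 days):

```python
# Rough estimate of GPU-hours consumed training the largest LLaMA model,
# assuming all 2,048 A100 GPUs ran continuously for the full 21 days.
gpus = 2048
days = 21
gpu_hours = gpus * days * 24
print(f"{gpu_hours:,} GPU-hours")  # prints "1,032,192 GPU-hours"
```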

“By building our own supercomputing capabilities, we’re able to control every layer of the stack, from data center design to training frameworks,” the spokesperson added. “The RSC helps AI researchers at Meta build new and better AI models that can learn from trillions of examples; work across hundreds of languages; seamlessly analyze text, images and video together; develop new augmented reality tools; and more.”

Video transcoder

In addition to the MTIA, Meta is developing other chips to handle specific types of computing workloads, the company revealed at today’s event. Among them is the Meta Scalable Video Processor (MSVP), which Meta says is its first homegrown ASIC designed for the processing needs of video on demand and live streaming.

As readers may recall, Meta began envisioning a custom server-side video chip years ago, announcing an ASIC for video transcoding and inference tasks in 2019. The MSVP is the fruit of those efforts, as well as a renewed push for competitive advantage in the area of live video specifically.

“On Facebook alone, people spend 50% of their time on the app watching videos,” Meta technical lead managers Harikrishna Reddy and Yunqing Chen said in a co-authored blog post published this morning. “To serve the wide variety of devices all over the world (mobile devices, laptops, TVs, etc.), videos uploaded to Facebook or Instagram, for example, are transcoded into multiple bitstreams, with different encoding formats, resolutions and quality … MSVP is programmable and scalable, and can be configured to efficiently support both the high-quality transcoding needed for VOD as well as the low latency and faster processing times required of live streaming.”
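Meta hasn’t published the MSVP’s actual configuration, but the workflow Reddy and Chen describe — one upload fanned out into multiple renditions — is the familiar adaptive-bitrate “encoding ladder.” A minimal sketch, with hypothetical rung values that are illustrative rather than Meta’s real settings:

```python
# Illustrative adaptive-bitrate encoding ladder: one uploaded video is transcoded
# into several renditions at different resolutions and bitrates so each viewer's
# device and connection can fetch a suitable stream.
# (Rung values below are hypothetical examples, not Meta's actual settings.)
LADDER = [
    # (height in pixels, video bitrate in kbps, codec)
    (1080, 4500, "h264"),
    (720, 2500, "h264"),
    (480, 1200, "h264"),
    (360, 700, "h264"),
]

def plan_renditions(source_height: int) -> list[dict]:
    """Return the renditions to produce, skipping rungs above the source resolution
    (transcoders generally avoid upscaling)."""
    return [
        {"height": h, "bitrate_kbps": kbps, "codec": codec}
        for h, kbps, codec in LADDER
        if h <= source_height
    ]

# A 720p upload gets the 720p rung and below — never the 1080p rung.
print(plan_renditions(720))
```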

Meta’s custom chips are designed to accelerate video workloads such as streaming and transcoding.

Meta says it will eventually offload the majority of its “stable and mature” video processing workloads to MSVPs, using software video encoding only for workloads that require specific customization and “significantly” higher quality. According to Meta, work continues on improving video quality with the MSVP using pre-processing methods such as smart denoising and image enhancement, as well as post-processing methods such as artifact removal and super-resolution.

“In the future, MSVP will allow us to support even more of Meta’s most important use cases and needs, including short-form videos, enabling efficient delivery of generative AI, AR/VR and other metaverse content,” Reddy and Chen said.

AI focus

If there’s one thing all of today’s hardware announcements have in common, it’s that Meta is desperate to pick up the pace when it comes to AI, especially generative AI.

Similar things have been telegraphed before. In February, CEO Mark Zuckerberg, who has reportedly made improving Meta’s computing power for AI a top priority, announced a new top-level generative AI team to, in his words, “turbocharge” the company’s R&D. CTO Andrew Bosworth similarly said that generative AI is the area where he and Zuckerberg have spent the most time lately. And chief scientist Yann LeCun has said Meta plans to introduce generative AI tools to create items in virtual reality.

“We’re exploring chat experiences in WhatsApp and Messenger, visual creation tools for posts and ads on Facebook and Instagram, and over time video and multimodal experiences as well,” Zuckerberg said during Meta’s Q1 earnings call in April. “I expect these tools will be valuable for everyone, from ordinary people to creators to businesses. Over time, this will extend to our work in the metaverse too, where people will much more easily be able to create avatars, objects, worlds, and the code that ties them all together.”

Meta has felt mounting pressure from investors worried that the company isn’t moving fast enough to capture the (potentially large) market for generative AI. That is also a factor here. It has no answer yet to chatbots like Bard, Bing Chat and ChatGPT. Nor has it made much progress in image generation, another key area experiencing explosive growth.

If forecasts are correct, the addressable market for generative AI software could total $150 billion. Goldman Sachs forecasts a 7% boost to GDP.

Even a fraction of that could offset the billions of dollars Meta has lost investing in “metaverse” technologies like augmented reality headsets, conferencing software, and VR playgrounds like Horizon Worlds. Reality Labs, the Meta division responsible for augmented reality technology, reported a net loss of $4 billion last quarter, and the company said on its Q1 call that it expects “operating losses to increase year over year in 2023.”




