Rethinking infrastructure for the AI era


Santosh Janardhan, Vice President and Head of Infrastructure Division

As we push the boundaries of AI research, deliver more cutting-edge AI applications and experiences across our family of apps, and build toward our long-term vision for the metaverse, we expect our need for AI compute to grow dramatically over the next decade.

We are executing an ambitious plan to build Meta’s next-generation AI infrastructure, and today we share details about our progress.

This includes our first custom silicon chip for running AI models, a new data center design optimized for AI, and the second phase of our 16,000 GPU supercomputer for AI research. These efforts, along with additional ongoing projects, will enable us to develop larger and more sophisticated AI models and deploy them efficiently at scale. AI is already at the core of our products, enabling better personalization, safer and fairer products, and richer experiences, while also helping businesses reach the audiences they care about most.

We are reimagining the way we code by introducing CodeCompose, a generative AI-based coding assistant built to improve developer productivity throughout the software development lifecycle.

By rethinking how we innovate across our infrastructure, we are building a scalable foundation to drive new opportunities in areas such as generative AI and the metaverse.

AI at the heart of infrastructure

Since breaking ground on our first data center in 2010, we have built a global infrastructure that serves the more than 3 billion people who use our family of apps every day. AI has been an important part of these systems for many years, from our Big Sur hardware in 2015 to the development of PyTorch and our AI research supercomputer.

Today, we are evolving our infrastructure in exciting new ways.

  • MTIA (Meta Training and Inference Accelerator): This is our in-house custom accelerator chip family, targeted at inference workloads. MTIA offers greater compute power and efficiency than CPUs and is tailored to our internal workloads. Deploying both MTIA chips and GPUs lets us improve performance, reduce latency, and increase efficiency for each workload.
  • Next-generation data center: Our next-generation data center design supports our current products while enabling both training and inference on future generations of AI hardware. This new data center will be an AI-optimized design, supporting liquid-cooled AI hardware and a high-performance AI network that connects thousands of AI chips into data center-scale AI training clusters. It will also be faster and more cost-effective to build, and it complements other new hardware, including MSVP, our first in-house-developed ASIC solution, which is designed to power the ever-growing video workloads at Meta.
  • Research SuperCluster (RSC) AI supercomputer: Meta's RSC, which we believe is one of the fastest AI supercomputers in the world, was built to train the next generation of large-scale AI models that power new augmented reality tools, content understanding systems, real-time translation technology, and more. It has 16,000 GPUs, all accessible across a three-level Clos network fabric that provides full bandwidth to each of the 2,000 training systems.
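As a quick sanity check on the RSC figures above, the quoted totals imply a fixed number of GPUs per training system. The sketch below is only arithmetic on the two numbers in the text. It assumes the GPUs are spread evenly across the systems, which the post does not state, and the function name is made up for illustration.

```python
# Figures quoted in the RSC bullet above.
TOTAL_GPUS = 16_000
TRAINING_SYSTEMS = 2_000


def gpus_per_system(total_gpus: int, systems: int) -> int:
    """Return GPUs per training system, ASSUMING an even split.

    The even split is an assumption made for this illustration;
    the post only gives the two totals.
    """
    if systems <= 0:
        raise ValueError("systems must be positive")
    per_system, remainder = divmod(total_gpus, systems)
    if remainder:
        raise ValueError("GPUs do not divide evenly across systems")
    return per_system


print(gpus_per_system(TOTAL_GPUS, TRAINING_SYSTEMS))  # prints 8
```

Under that assumption, each of the 2,000 training systems would hold 8 GPUs.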

Benefits of an end-to-end integration stack

By custom-designing much of our infrastructure, we can optimize the end-to-end experience from the physical layer to the virtual layer to the software layer to the actual user experience.

We design, build, and operate everything from our data centers to our server hardware to the mechanical systems that keep it all running. Because we control the stack from top to bottom, we can customize it for our specific needs. For example, we can easily colocate GPUs, CPUs, networking, and storage if doing so better supports our workloads. If that means we need different power or cooling solutions as a result, we can rethink those designs as part of one cohesive system.

This will become increasingly important in the years to come. Over the next decade, we expect increased specialization and customization in chip design, purpose-built and workload-specific AI infrastructure, new systems and tools for deployment at scale, and greater efficiency in product and design support. All of this will deliver increasingly sophisticated models built on the latest research, along with products that make this new technology accessible to people around the world.

Delivering long-term value and impact guides our infrastructure vision. We believe our track record of building world-class infrastructure positions us to continue leading in AI over the next decade and beyond.

Read more about how we are investing in AI.


