Generative artificial intelligence (AI) is transforming customer experiences in industries around the world. Customers are building generative AI applications using large language models (LLMs) and other foundation models (FMs). These applications improve customer experiences, transform operations, increase employee productivity, and create new revenue channels.
FMs and the applications built around them are extremely valuable investments for our customers. They are often used with highly sensitive business data, such as personal data, compliance data, operational data, and financial information, to optimize model output. The biggest concern we hear from customers exploring the benefits of generative AI is how to protect their sensitive data and investments. Because data and model weights are so valuable, customers require that they stay protected, secure, and private, whether the risk comes from their own administrator accounts, their customers, vulnerabilities in software running in their own environments, or even their cloud service provider having access.
At AWS, protecting the security and confidentiality of our customers' workloads is a top priority. We think about security across the three layers of the generative AI stack:
- Bottom layer – Provides the tools to build and train LLMs and other FMs
- Middle layer – Provides access to all the models, along with the tools needed to build and scale generative AI applications
- Top layer – Contains applications that use LLMs and other FMs to write and debug code, generate content, derive insights, and take actions that make work stress-free
Each layer is critical to making generative AI pervasive and transformative.
With the AWS Nitro System, we delivered a first-of-its-kind innovation on behalf of our customers. The Nitro System is an unparalleled computing backbone for AWS, with security and performance at its core. Its specialized hardware and associated firmware are designed to enforce restrictions so that nobody, including anyone at AWS, can access the workloads or data running on your Amazon Elastic Compute Cloud (Amazon EC2) instances. Customers have benefited from this confidentiality and isolation from AWS operators on all Nitro-based EC2 instances since 2017.
By design, there is no mechanism for Amazon employees to access the Nitro EC2 instances that customers use to run their workloads or access the data that customers send to machine learning (ML) accelerators or GPUs. This protection applies to all Nitro-based instances, including instances with ML accelerators such as AWS Inferentia and AWS Trainium, and instances with GPUs such as P4, P5, G5, and G6.
The Nitro System enables Elastic Fabric Adapter (EFA), which uses the AWS-built Scalable Reliable Datagram (SRD) communication protocol for cloud-scale, elastic, large-scale distributed training, and provides the only always-encrypted network capable of Remote Direct Memory Access (RDMA). All communication through EFA is encrypted with VPC encryption without incurring any performance penalty.
The design of the Nitro System has been validated by NCC Group, an independent cybersecurity firm. AWS provides a high level of protection for customer workloads, and we believe this is the level of security and confidentiality that customers should expect from their cloud provider. This level of protection is so important that we have added it to our AWS Service Terms to provide additional assurance to all of our customers.
Innovating secure generative AI workloads using AWS industry-leading security capabilities
AWS AI infrastructure and services have had security and privacy features built in from day one, giving you control over your data. As customers move quickly to implement generative AI in their organizations, they need to know that their data is handled securely across the AI lifecycle, including data preparation, training, and inference. The security of model weights (the parameters a model learns during training, which are critical to its ability to make predictions) is paramount to protecting your data and maintaining model integrity.
That's why it is critical for AWS to keep innovating on behalf of our customers and raising the bar on security across each layer of the generative AI stack. To do this, we believe security and confidentiality must be built in at every layer. You need to be able to secure the infrastructure used to train LLMs and other FMs, build securely with the tools that run LLMs and other FMs, and run applications that use FMs with built-in security and privacy you can trust.
At AWS, securing AI infrastructure means that no unauthorized person, whether at the infrastructure operator or at the customer, has access to sensitive AI data, such as AI model weights and the data processed with those models. It consists of three key principles:
- Completely isolate AI data from infrastructure operators – Infrastructure operators must not have access to customer content or AI data, such as AI model weights or data processed by the model.
- Ability for customers to isolate AI data from themselves – The infrastructure must provide a mechanism that allows model weights and data to be loaded into the hardware while remaining isolated and inaccessible from the customer's own users and software.
- Secure infrastructure communications – Communication between devices within the ML accelerator infrastructure must be secured. All externally accessible links between devices must be encrypted.
The Nitro System fulfills the first principle of secure AI infrastructure by isolating your AI data from AWS operators. The second principle gives you a way to remove administrative access to your AI data from your own users and software. AWS not only offers a way to do that, but has also made it straightforward and practical by investing in an integrated solution between AWS Nitro Enclaves and AWS Key Management Service (AWS KMS). With Nitro Enclaves and AWS KMS, you can encrypt your sensitive AI data with keys that you own and control, store that data in a location of your choice, and securely transfer the encrypted data to an isolated compute environment for inference. Throughout this process, the sensitive AI data is encrypted and isolated from your own users and software on your EC2 instance, and AWS operators cannot access it. Use cases that have benefited from this flow include running LLM inference in an enclave. Until today, Nitro Enclaves have operated only in the CPU, which limits their potential for larger generative AI models and more complex processing.
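The encrypt, store, and transfer steps in the flow above follow the general pattern known as envelope encryption. The following is a minimal sketch of that pattern in Python using only the standard library. It is not the AWS KMS or Nitro Enclaves API: the function names (`seal`, `unseal`, `envelope_encrypt`, `envelope_decrypt`) are hypothetical, the master key is held in a local variable rather than in a key service, and the toy cipher is an unauthenticated stand-in for an authenticated cipher such as AES-GCM. It should not be used to protect real data.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (HMAC-SHA256 in counter mode). Illustration only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt under `key`; the random nonce is prepended to the output."""
    nonce = secrets.token_bytes(16)
    return nonce + _keystream_xor(key, nonce, plaintext)


def unseal(key: bytes, blob: bytes) -> bytes:
    """Reverse of seal(): split off the nonce and decrypt the rest."""
    nonce, body = blob[:16], blob[16:]
    return _keystream_xor(key, nonce, body)


def envelope_encrypt(master_key: bytes, weights: bytes) -> dict:
    """Encrypt data with a fresh data key, then wrap that key."""
    data_key = secrets.token_bytes(32)  # one-time key for this payload
    return {
        "wrapped_key": seal(master_key, data_key),  # stored next to the data
        "ciphertext": seal(data_key, weights),      # safe to store anywhere
    }


def envelope_decrypt(master_key: bytes, envelope: dict) -> bytes:
    """Unwrap the data key, then decrypt the payload (the enclave's role)."""
    data_key = unseal(master_key, envelope["wrapped_key"])
    return unseal(data_key, envelope["ciphertext"])
```

In this sketch only the holder of the master key can recover the data key, so the stored envelope can move through untrusted storage and hosts. In the flow described above, that unwrapping step would happen inside the isolated compute environment rather than on the parent instance.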
We announced plans to extend this Nitro end-to-end encrypted flow to include first-class integration with ML accelerators and GPUs, fulfilling the third principle. You will be able to decrypt and load sensitive AI data into an ML accelerator for processing, while keeping it isolated from your own operators and verifying the authenticity of the application used to process the AI data. Through the Nitro System, you can cryptographically validate your applications to AWS KMS and decrypt data only when the necessary checks pass. This enhancement allows AWS to offer end-to-end encryption for your data as it flows through generative AI workloads.
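The idea of decrypting data only when the necessary checks pass can be sketched as a key service that releases a key only to an application whose measurement matches an expected value. The example below is a simplified illustration under stated assumptions, not the AWS KMS API: `ToyKeyService`, `release_key`, and `measure` are hypothetical names, and the measurement is passed in directly, whereas a real attestation flow would carry it in a signed attestation document that the key service verifies.

```python
import hashlib
import hmac
import secrets


def measure(image: bytes) -> str:
    """Hypothetical measurement: a SHA-384 digest of the application image."""
    return hashlib.sha384(image).hexdigest()


class ToyKeyService:
    """Hypothetical stand-in for a key service; not the AWS KMS API."""

    def __init__(self, expected_measurement: str) -> None:
        self._expected = expected_measurement
        self._data_key = secrets.token_bytes(32)

    def release_key(self, attested_measurement: str) -> bytes:
        """Return the data key only if the measurement matches the policy."""
        # Constant-time comparison avoids leaking how many characters matched.
        if not hmac.compare_digest(self._expected, attested_measurement):
            raise PermissionError("attestation check failed; key not released")
        return self._data_key


if __name__ == "__main__":
    trusted_image = b"inference-app-v1"  # assumed example application image
    service = ToyKeyService(measure(trusted_image))

    key = service.release_key(measure(trusted_image))
    print("key released:", len(key), "bytes")

    try:
        service.release_key(measure(b"tampered-app"))
    except PermissionError as err:
        print("denied:", err)
```

The design choice here is that the policy names the application, not a user: changing a single byte of the image changes its measurement, so a modified application cannot obtain the key even when the same operator runs it.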
We plan to offer this end-to-end encrypted flow in the upcoming AWS-designed Trainium2 and in GPU instances based on NVIDIA's upcoming Blackwell architecture, both of which provide secure communications between devices, the third principle of secure AI infrastructure. AWS and NVIDIA are collaborating closely to bring a joint solution to market, including NVIDIA's new Blackwell GPU platform, which combines NVIDIA's GB200 NVL72 solution with the Nitro System and EFA technologies to provide an industry-leading solution for securely building and deploying next-generation generative AI applications.
Advancing the future of generative AI security
Today, tens of thousands of customers use AWS to experiment with transformative generative AI applications and move them into production. Generative AI workloads contain highly valuable and sensitive data that needs this level of protection from both your own operators and your cloud service provider. Customers using AWS Nitro-based EC2 instances have had this level of protection and isolation from AWS operators since 2017, when we launched the innovative Nitro System.
At AWS, we continue to innovate and invest in building performant, accessible capabilities that make it practical for customers to secure their generative AI workloads across the three layers of the generative AI stack, so that you can focus on what you do best: building generative AI applications and extending their use to more areas.
About the authors
Anthony Liguori is an AWS Vice President and EC2 Distinguished Engineer.
Colm MacCárthaigh is an AWS Vice President and EC2 Distinguished Engineer.
