Private AI Compute enables Google inference with hardware isolation and ephemeral data design

Google announced Private AI Compute, a system designed to process AI requests using Gemini cloud models while keeping user data private. The company describes Private AI Compute as a technology built to “unleash the speed and power of Gemini cloud models for AI experiences,” claiming that it “gets faster, more helpful responses, making it easier to find what you need, get smart recommendations, and take action.” The announcement positions Private AI Compute as Google’s approach to addressing privacy concerns while delivering cloud-based AI capabilities, building on what the company calls privacy-enhancing technologies (PETs), which it has developed for AI use cases.

Google has designed Private AI Compute with multiple layers of protection for data during processing. The system uses an AMD-based hardware Trusted Execution Environment (TEE) for CPU and TPU workloads to “encrypt and isolate memory and processing from the host.” To meet the requirements of Private AI Compute, the company extended its Titanium hardware security architecture to TPU hardware, starting with the sixth-generation Google Cloud TPU, known as Trillium. The architecture also establishes encrypted communication channels between trusted nodes using protocols such as Noise and Application Layer Transport Security (ALTS). Google says it verifies the integrity of trusted nodes as part of establishing these encrypted channels, which, according to the company, protects user data from the broader Google infrastructure.
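The idea of verifying a node's integrity before opening an encrypted channel can be sketched as follows. This is a minimal illustration and not Google's implementation: the names `AttestationReport`, `verify_report`, `open_channel`, and `TRUSTED_MEASUREMENTS` are hypothetical, an HMAC over a shared key stands in for a hardware-rooted asymmetric signature, and the hash-based key derivation stands in for a real Noise or ALTS handshake.

```python
import hashlib
import hmac
from dataclasses import dataclass

# Hypothetical allowlist of approved workload measurements (hash of the binary).
TRUSTED_MEASUREMENTS = {
    hashlib.sha256(b"approved-inference-server-v1").hexdigest(),
}


@dataclass(frozen=True)
class AttestationReport:
    measurement: str  # hash of the code the node claims to be running
    nonce: bytes      # freshness value chosen by the verifier
    signature: bytes  # stand-in for a hardware-rooted signature


def sign_report(hardware_key: bytes, measurement: str, nonce: bytes) -> bytes:
    """Toy signature: real TEEs sign with a vendor-rooted asymmetric key."""
    return hmac.new(hardware_key, measurement.encode() + nonce, hashlib.sha256).digest()


def verify_report(hardware_key: bytes, report: AttestationReport, expected_nonce: bytes) -> bool:
    """Accept a report only if it is authentic, fresh, and names approved code."""
    expected_sig = sign_report(hardware_key, report.measurement, expected_nonce)
    return (
        hmac.compare_digest(expected_sig, report.signature)
        and report.nonce == expected_nonce
        and report.measurement in TRUSTED_MEASUREMENTS
    )


def open_channel(hardware_key: bytes, report: AttestationReport, nonce: bytes) -> bytes:
    """Return a session key only after attestation succeeds."""
    if not verify_report(hardware_key, report, nonce):
        raise PermissionError("attestation failed; refusing to open channel")
    # Toy derivation that binds the key to the attested measurement and nonce.
    return hashlib.sha256(b"session" + report.measurement.encode() + nonce).digest()
```

The design point is the ordering: no session key exists until the peer's measurement has been checked against the allowlist, so a node running unapproved code never receives user data.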

Source: Chain of Trust for Private AI Compute

Private AI Compute includes protections designed to address abuse of privileged access. The system is ephemeral by design, meaning “inputs, model inference, and calculations are retained only as long as necessary to satisfy a user’s query,” so an attacker who gains access cannot retrieve historical data, according to Google. Key services run on a confidential computing platform based on AMD’s hardware Trusted Execution Environment (TEE), and front-end services run in confidential virtual machines. Google says this approach protects guest virtual machine workloads from the host and validates code through attestation. The system also uses IP blinding relays operated by third parties to tunnel traffic to Private AI Compute. Google claims this removes the ability to link a user’s IP address or network identifier to a specific query.
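The separation of knowledge behind an IP blinding relay can be modeled in a few lines. This is a simplified sketch under stated assumptions, not a description of Google's relay protocol: the classes `ClientPacket`, `RelayedPacket`, `BlindingRelay`, and `InferenceFrontend` are hypothetical, and the payload is treated as opaque bytes that the client has already encrypted for the inference service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientPacket:
    source_ip: str         # the user's real network address
    sealed_payload: bytes  # already encrypted for the service; opaque to the relay


@dataclass(frozen=True)
class RelayedPacket:
    source_ip: str         # the relay's address, not the user's
    sealed_payload: bytes  # forwarded unchanged


class BlindingRelay:
    """Third-party hop: sees the client IP but cannot read the payload."""

    def __init__(self, relay_ip: str) -> None:
        self.relay_ip = relay_ip

    def forward(self, packet: ClientPacket) -> RelayedPacket:
        # Drop the client's address and substitute the relay's own.
        return RelayedPacket(source_ip=self.relay_ip, sealed_payload=packet.sealed_payload)


class InferenceFrontend:
    """Service side: can open the payload but only ever sees the relay's IP."""

    def __init__(self) -> None:
        self.seen_ips: list[str] = []

    def receive(self, packet: RelayedPacket) -> bytes:
        self.seen_ips.append(packet.source_ip)
        return packet.sealed_payload


relay = BlindingRelay(relay_ip="198.51.100.7")
frontend = InferenceFrontend()
client_packet = ClientPacket(source_ip="203.0.113.42", sealed_payload=b"<ciphertext>")
delivered = frontend.receive(relay.forward(client_packet))
```

In this model the relay knows who is sending but not what, and the service knows what was sent but not by whom, which is the unlinkability property the article attributes to the relays.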

Private AI Compute lets on-device features draw on more capable cloud models while maintaining privacy protections. Google says the technology makes Magic Cue “more useful with more timely suggestions” on the latest Pixel 10 smartphones. The Pixel’s Recorder app uses Private AI Compute to “summarize transcriptions across a wide range of languages,” according to the company.

Private AI Compute reflects a broader industry trend toward privacy-focused AI systems. Apple’s Private Cloud Compute and Meta’s Private Processing pursue similar goals of offloading AI workloads to the cloud while implementing cryptographic and hardware-based protections.

As one Hacker News commenter put it:

Aside from the obvious risk that the TEE manufacturer holds the keys and may share access with others if forced or willing, there are several research papers detailing how trusted execution environments can be attacked.

Acting as an external auditor, NCC Group assessed whether Private AI Compute’s system design met privacy and security guidelines and reported that it did. The audit included an architectural review of the Private AI Compute system, a cryptographic security assessment of the Oak session libraries, and a security analysis of the IP blinding relays.

Developers interested in private AI inference can consider OpenPCC, an open-source framework available on GitHub. The repository provides technical details for those who want to explore or experiment with private AI architectures.
