In future video games, AI teammates will actually listen to you

For a while now, gamers have been shouting at screens, barking orders, and venting their frustrations to virtual teammates who can’t hear them. The few games that have incorporated voice commands required players to memorize a rigid menu of specific phrases.

But at the Nvidia GTC 2026 developer conference earlier this year, French video game giant Ubisoft offered a glimpse of a future where on-screen characters understand what you’re saying and talk back, in the form of Teammates, an experimental prototype that replaces traditional pre-programmed non-player characters (NPCs) with squadmates powered by generative artificial intelligence (GenAI).

Expanding on Ubisoft’s 2024 Neo NPC project, which was recognized under France’s France 2030 innovation program, Teammates drops players into a first-person shooter alongside virtual soldiers that respond to natural language, environmental context, and the player’s personal slang.

When you tell a virtual teammate, “Hide behind that car and await my command to shoot the nearest enemy,” the character parses the order, assesses its surroundings, and executes the maneuver while staying aware of your overall strategy.

According to Ubisoft, achieving this level of immersion requires more than a powerful large language model (LLM). The project team had to rethink the inference pipeline to abstract away complexity and optimize latency.

Complexity abstraction

The goal behind the scenes is not only to build AI-powered games, but also to create a foundation that thousands of artists, writers, and designers can use without any prior experience with AI or machine learning.

“Most game development teams don’t have all the specialized skills needed to update complex GenAI systems,” Joel Gregoire, technical director at Ubisoft Paris, explained in a GTC presentation. “The answer was to build a platform that abstracts the complexity and lets teams create games with GenAI capabilities.”

Ubisoft’s solution acts as GenAI-agnostic middleware. The platform is built around a C++ software development kit (SDK) for creating gameplay components, such as NPC interactions, that are dynamically translated into prompts. Through custom engine plugins, this data is fed directly into Ubisoft’s proprietary Snowdrop and Anvil engines, which convert the raw language model output into engine-specific formats such as facial animation data.
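As a rough illustration of that component-to-prompt flow, here is a minimal, hypothetical C++ sketch (not Ubisoft’s actual SDK): a gameplay component folds world context and the player’s transcribed speech into a prompt, queries an interchangeable GenAI backend, and hands the raw reply to an engine-side callback. The class names, prompt format, and stub backend are all assumptions made for illustration.

```cpp
// Hypothetical illustration only, not Ubisoft's actual SDK: a gameplay
// component serializes world context and player speech into a prompt,
// queries an interchangeable GenAI backend, and hands the raw reply to an
// engine-side callback (e.g. a Snowdrop/Anvil plugin). Names are assumed.
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Abstract backend so the same component can target any model or provider.
struct GenAIBackend {
    virtual ~GenAIBackend() = default;
    virtual std::string Complete(const std::string& prompt) = 0;
};

// Minimal game-side context that gets folded into the prompt.
struct NpcContext {
    std::string npcName;
    std::string currentOrders;
    std::vector<std::string> visibleObjects;
};

class NpcInteractionComponent {
public:
    explicit NpcInteractionComponent(GenAIBackend& backend) : backend_(backend) {}

    // Build a prompt from the transcript plus context, then forward the raw
    // model output to the engine plugin for animation, barks, etc.
    void OnPlayerUtterance(const std::string& transcript, const NpcContext& ctx,
                           const std::function<void(const std::string&)>& toEngine) {
        std::string prompt = "You are " + ctx.npcName + ", a squad teammate.\n"
                             "Current orders: " + ctx.currentOrders + "\nVisible:";
        for (const std::string& obj : ctx.visibleObjects) prompt += " " + obj;
        prompt += "\nPlayer says: \"" + transcript + "\"\n"
                  "Reply with one action call and one short spoken line.";
        toEngine(backend_.Complete(prompt));
    }

private:
    GenAIBackend& backend_;
};

// Tiny stub backend so the sketch is self-contained and runnable.
struct EchoBackend : GenAIBackend {
    std::string Complete(const std::string&) override {
        return "move_to(car_07)\nsay(\"Copy, holding behind the car.\")";
    }
};

int main() {
    EchoBackend backend;
    NpcInteractionComponent npc(backend);
    npc.OnPlayerUtterance("Hide behind that car and wait for my signal.",
                          {"Reyes", "hold position", {"car_07", "doorway_02"}},
                          [](const std::string& reply) { std::cout << reply << "\n"; });
}
```

Keeping the backend behind an abstract interface is what would make such middleware model-agnostic: designers work with gameplay components while the model or provider can be swapped without touching game code.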

“Think of this as GenAI-agnostic middleware that you can easily plug into your in-house game engine,” said Xavier Manzanares, Director of Gameplay GenAI at Ubisoft. “It creates a whole new opportunity for our team.”

Awkward pause problem

As exciting as the potential for conversational AI in games is, the engineering required to make it compelling is extremely complex. For all their linguistic fluency, LLMs are notoriously computationally intensive and slow.

Most game development teams don’t have all the specialized skills needed to update complex GenAI systems. The answer was to abstract complexity and build a platform to create games with GenAI capabilities.

Joel Gregoire, Ubisoft Paris

In a normal conversation, humans respond within a second or so. When Ubisoft began testing early generative models, it took over three seconds for a character to process the player’s voice, decide on an action, generate a response, and synthesize the voice.

“Creativity starts with quality,” Maxime Sazadaly, technical lead machine learning engineer at Ubisoft Paris, told the Nvidia GTC audience. “But in reality, there is something almost as important as quality: latency.”

A three-second delay in the middle of a virtual gunfight can leave players staring at a blank, unresponsive avatar. “Even if the action is the right one, it doesn’t come across as intelligent if it takes too long,” Sazadaly said.

To make the characters feel alive, Ubisoft’s engineers determined that the entire loop, from the moment the player speaks into the microphone to the moment the character reacts, had to complete within two seconds. The team attacked this in three ways:

  1. Use faster base models: The team switched from slower models to more efficient ones, using Nvidia’s Parakeet-tdt-v3 for automatic speech recognition (ASR), Gemini 2.5 Flash Lite for cloud LLM inference, and ElevenLabs Flash v2 for text-to-speech (TTS).
  2. Stream everything: Instead of waiting for the entire response to be generated, Ubisoft implemented partial parsing of function calls. The moment the LLM streams out the first executable function, it is pushed into the game’s behavior tree and the NPC can start moving; audio is likewise streamed and stitched together in chunks (see the sketch after this list).
  3. Prompt factorization: By identifying redundant aliases and data in prompts that ran to more than 10,000 tokens, the team cut prompt size by 30% and significantly improved time to first token (TTFT).
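The streaming step is the most code-like of the three, so here is a hedged sketch of the idea behind it: model output is consumed chunk by chunk, and the first complete function call is dispatched to the behavior tree before the rest of the reply has finished generating. The one-call-per-line output format, the call names, and the parser itself are assumptions, not Ubisoft’s implementation.

```cpp
// Hedged sketch of the streaming idea from step 2: model output is consumed
// chunk by chunk, and the first complete function call is dispatched to the
// behavior tree before the rest of the reply has even been generated. The
// one-call-per-line output format and the call names are assumptions.
#include <functional>
#include <iostream>
#include <string>

class StreamingCallParser {
public:
    explicit StreamingCallParser(std::function<void(const std::string&)> dispatch)
        : dispatch_(std::move(dispatch)) {}

    // Called for every chunk as it streams in from the LLM.
    void Feed(const std::string& chunk) {
        buffer_ += chunk;
        size_t newline;
        while ((newline = buffer_.find('\n')) != std::string::npos) {
            std::string line = buffer_.substr(0, newline);
            buffer_.erase(0, newline + 1);
            // A complete call like "move_to(car_07)" can be acted on now.
            if (!line.empty() && line.find('(') != std::string::npos && line.back() == ')')
                dispatch_(line);
        }
    }

private:
    std::string buffer_;
    std::function<void(const std::string&)> dispatch_;
};

int main() {
    StreamingCallParser parser([](const std::string& call) {
        std::cout << "push to behavior tree: " << call << "\n";  // NPC starts acting here
    });
    // Simulated chunks arriving while the model is still generating.
    parser.Feed("move_to(c");
    parser.Feed("ar_07)\nhold_");
    parser.Feed("fire()\nsay(\"Copy, moving up.\")\n");
}
```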

As a result of these optimizations, the team reduced response time to just 1.5 seconds.

Gregoire’s team also built an application programming interface (API) gateway that lets developers run inference in the cloud, access third-party or Ubisoft-hosted models via Kubernetes and the Nvidia GPU Operator, or run everything on-device to enable offline play and reduce operational costs.
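In spirit, the gateway gives the game a single decision point for where inference runs. The sketch below shows that idea at its simplest, choosing between a cloud endpoint and a local model per session; the endpoint URL, model file name, and selection policy are invented for illustration and are not Ubisoft’s API.

```cpp
// Illustrative sketch, not Ubisoft's actual API: the gateway idea boils down
// to picking an inference route per session, either a cloud endpoint (third-
// party or Ubisoft-hosted models behind Kubernetes) or a fully local backend
// for offline play. The endpoint, file name, and policy are assumptions.
#include <iostream>
#include <string>

enum class Route { Cloud, OnDevice };

struct InferenceTarget {
    Route route;
    std::string endpoint;  // gateway URL for cloud, model file for on-device
};

// Choose a backend based on connectivity and local GPU headroom.
InferenceTarget SelectInferenceTarget(bool online, bool hasLocalGpuBudget) {
    if (!online || hasLocalGpuBudget) {
        // Offline play, or enough GPU headroom: run everything on-device.
        return {Route::OnDevice, "models/qwen3-4b-instruct-int4.bin"};
    }
    // Otherwise route the request through the hosted API gateway.
    return {Route::Cloud, "https://genai-gateway.example.internal/v1/chat"};
}

int main() {
    InferenceTarget t = SelectInferenceTarget(/*online=*/false, /*hasLocalGpuBudget=*/true);
    std::cout << (t.route == Route::OnDevice ? "on-device: " : "cloud: ")
              << t.endpoint << "\n";
}
```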

Using Nvidia in-game inferencing and CUDA-in-Graphics integration, the team successfully deployed Teammates locally on high-end consumer GPUs such as the Nvidia RTX 4090 and RTX 5090. To stay within the rendering budget of a typical AAA game, the team also used a highly optimized small language model (SLM), the 4-billion-parameter Qwen3-4B-Instruct-2507, quantized to INT4 for speed or FP8 for quality, with local audio generation handled by the KaniTTS-400m model.
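For reference, a minimal sketch of what that on-device configuration could look like is below. The model names are those cited in the talk, while the structure and the speed/quality switch are assumptions added for illustration.

```cpp
// Minimal sketch of the on-device configuration described above. The model
// names come from the talk; the struct layout and the speed/quality switch
// are assumptions added for illustration.
#include <iostream>
#include <string>

enum class LlmPrecision { Int4Speed, Fp8Quality };

struct OnDevicePipelineConfig {
    std::string llmModel = "Qwen3-4B-Instruct-2507";   // 4B-parameter SLM
    std::string ttsModel = "KaniTTS-400m";             // local audio generation
    LlmPrecision precision = LlmPrecision::Int4Speed;  // INT4 for speed, FP8 for quality
};

int main() {
    OnDevicePipelineConfig cfg;
    cfg.precision = LlmPrecision::Fp8Quality;  // trade latency for response quality
    std::cout << cfg.llmModel << " @ "
              << (cfg.precision == LlmPrecision::Int4Speed ? "INT4" : "FP8")
              << ", TTS: " << cfg.ttsModel << "\n";
}
```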

“Today’s high-end hardware and optimized inference stacks allow us to run multi-model GenAI pipelines in parallel with gaming workloads,” said Gregoire. “Moving inference onto the device is the next logical step in making NPC interactions scalable.”

The prototype also includes Jaspar, an AI personal assistant that helps players navigate game menus, adjusts the interface for accessibility needs such as colorblindness, and offers tactical advice. Additionally, the game’s AI constantly analyzes player behavior, awarding dynamic achievements based on the player’s play style and providing a personalized debriefing at the end of each mission.

It remains to be seen whether Ubisoft’s project will develop further and completely replace the hand-crafted cinematic moments that are a hallmark of blockbuster games. For now, the industry’s biggest companies see AI as a tool to enhance the gaming experience.

“Creativity remains human,” said Yves Guillemot, co-founder and chief executive officer of Ubisoft. “AI provides tools that help bring creative visions to life in new ways. AI is a powerful enabler to create even more meaningful and immersive experiences for players.”


