Studio quality and speed
Developers can now choose from two different model variants designed to meet specific operational and latency requirements.
- Lyria 3 Pro (lyria-3-pro-preview): The premier model for full-length song generation, producing tracks up to approximately 3 minutes long. These tracks follow professional-level song structure and set the standard for premium, studio-quality output.
- Lyria 3 Clip (lyria-3-clip-preview): Optimized for speed and high-volume requests, this variant produces high-quality 30-second clips. It’s an ideal choice for rapid prototyping, background loops, and social media assets.
Both models deliver realistic vocals with expressive nuance, improved clarity, and a more natural sound. They also cover global languages and genres: you can generate vocals in different languages and create music ranging from pop to funk to Motown.
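As a rough illustration of choosing between the two variants, the sketch below picks a model ID from a target track length. The model IDs come from the list above; the `choose_model` helper and the exact second limits are assumptions for illustration, not part of any official SDK.

```python
# Model IDs as given in the announcement.
LYRIA_PRO = "lyria-3-pro-preview"
LYRIA_CLIP = "lyria-3-clip-preview"

# Assumed limits in seconds, derived from the descriptions above:
# Clip produces 30-second clips, Pro tracks run up to roughly 3 minutes.
CLIP_SECONDS = 30
PRO_MAX_SECONDS = 180


def choose_model(target_seconds: int) -> str:
    """Return the model ID suited to the requested length (hypothetical helper)."""
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")
    if target_seconds <= CLIP_SECONDS:
        return LYRIA_CLIP
    if target_seconds <= PRO_MAX_SECONDS:
        return LYRIA_PRO
    raise ValueError("requested length exceeds the roughly 3-minute Pro limit")


if __name__ == "__main__":
    print(choose_model(20))   # short loop, routed to the Clip variant
    print(choose_model(150))  # full-length song, routed to the Pro variant
```

In practice you might also route by latency or request volume rather than length alone, since the Clip variant is the one optimized for speed.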
Precision control and multimodal input
Lyria 3 introduces granular controls that let you direct the model precisely through natural language prompts.
- Tempo adjustment: Set a specific tempo (fast, slow, etc.) with high precision to ensure that the music matches the rhythm of your application.
- Time-aligned lyrics: Prompts can outline the progression of a song and control when lyrics begin and end within a track.
- Multimodal image-to-music input: Lyria 3 supports multimodal input in addition to text. You can provide images that influence the feel, style, and mood of your audio.
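Because these controls are expressed in natural language, one way to keep prompts consistent is to assemble them from structured data. The sketch below builds a text prompt with a tempo and time-aligned lyric sections. The `LyricSection` type, the `build_prompt` helper, and the `[m:ss-m:ss]` layout are assumptions for illustration; the announcement does not specify a required prompt syntax.

```python
from dataclasses import dataclass


@dataclass
class LyricSection:
    """One timed part of a song (hypothetical structure)."""
    label: str    # e.g. "Verse 1"
    start_s: int  # when the lyrics begin, in seconds
    end_s: int    # when the lyrics end, in seconds
    lyrics: str


def fmt_time(seconds: int) -> str:
    """Format a second count as m:ss."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


def build_prompt(style: str, tempo: str, sections: list[LyricSection]) -> str:
    """Assemble a natural language prompt; the layout is an assumed convention."""
    lines = [f"Style: {style}", f"Tempo: {tempo}"]
    for section in sections:
        if section.end_s <= section.start_s:
            raise ValueError(f"{section.label}: end must come after start")
        span = f"[{fmt_time(section.start_s)}-{fmt_time(section.end_s)}]"
        lines.append(f"{span} {section.label}: {section.lyrics}")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_prompt(
        style="funk with a Motown feel",
        tempo="fast",
        sections=[
            LyricSection("Verse 1", 15, 45, "City lights are calling me home"),
            LyricSection("Chorus", 45, 75, "We keep on moving, moving on"),
        ],
    )
    print(prompt)
```

The resulting string would be sent as the text part of a request; an image, if used, would be supplied alongside it as a separate input.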
Lyria 3 in action
To show you how to incorporate this model into your application, we built some examples in Google AI Studio.
- Video background music: This demo app lets users upload a video, which Gemini 3 Flash analyzes to generate a descriptive prompt for a custom soundtrack. Lyria then uses this prompt to create a matching instrumental that serves as the video’s synced background music.
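The two-stage flow of the video demo can be sketched as a small pipeline. This is not the demo's actual code: `score_video` and both callables are hypothetical stand-ins, so that the real Gemini and Lyria client calls (not shown here) can be plugged in without this sketch assuming their signatures.

```python
from typing import Callable


def score_video(
    video_path: str,
    describe_video: Callable[[str], str],
    generate_music: Callable[[str], bytes],
) -> bytes:
    """Turn a video into background music in two steps (hypothetical pipeline).

    describe_video: stand-in for a Gemini call that returns a text description.
    generate_music: stand-in for a Lyria call that returns audio bytes.
    """
    description = describe_video(video_path).strip()
    if not description:
        raise ValueError("video analysis returned an empty description")
    # Ask for an instrumental so vocals do not compete with the video's audio.
    prompt = f"Instrumental background music, no vocals. {description}"
    return generate_music(prompt)


if __name__ == "__main__":
    # Stub implementations so the sketch runs without network access.
    def fake_describe(path: str) -> str:
        return "Slow sunrise over mountains, calm and hopeful."

    def fake_generate(prompt: str) -> bytes:
        return prompt.encode("utf-8")  # placeholder for real audio bytes

    audio = score_video("sunrise.mp4", fake_describe, fake_generate)
    print(len(audio), "bytes")
```

Keeping the two model calls behind plain callables also makes the flow easy to test with stubs before wiring in real clients.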
