After months of beta testing, leading AI audio company ElevenLabs has released its latest product, a generative sound effects tool that lets you create any noise you want from a simple text prompt.
I've been using it myself for the past few months and it keeps getting better. It can create incredibly accurate sounds and, when pushed, even background music.
The company describes it as “transforming imagination into sound,” and says it can create any sound within seconds, from the sound of jumping into water, to coins jingling, to a horse galloping.
In the beta preview, users were given four examples based on a text prompt, each with a unique interpretation by the AI model; in the final release, that number has increased to six.
How do AI sound effects work?
ElevenLabs' sound effects engine, like other generative AI models, requires large amounts of training data and computational time to teach it how to interpret text and turn it into sound.
It builds on ElevenLabs' growing collection of sound-based models, including a highly realistic voice engine that can replicate anyone's voice in just a few minutes.
“Last year, we revolutionized AI Voices by developing the first truly emotive, human-like text-to-speech platform. With the introduction of Text to Sound Effects, we're taking another big step forward, giving creators even more audio tools to help them create high-quality content,” said Mati Staniszewski, CEO and co-founder of ElevenLabs, in a statement.
The company is also working on a new music model, so in the future you'll be able to create a video and have ElevenLabs add dialogue, sound effects and a soundtrack to create a more immersive experience.
The training data for the new sound effects model came from a new partnership with Shutterstock, which enabled ElevenLabs to train and fine-tune the model on Shutterstock's accurately labeled audio library of licensed tracks.
ElevenLabs Sound Effects Test
To test ElevenLabs' sound effects feature, I prepared seven videos containing obvious and not-so-obvious sounds and generated audio for each. Each video was created using Pika Labs from a simple, one-paragraph text prompt.
I gave the same prompts to both Pika Labs and ElevenLabs Sound Effects. It worked for all of the prompts and created a full sound, except for the fire, where I had to simplify the prompt down to just “a roaring fireplace” because it kept creating a confusing and overly complicated soundscape.
1. Burning Fire
Prompt: “A cozy fireplace in a rustic log cabin. The flames dance vibrantly, the wood crackles, and sparks fly from the chimney. Warm light illuminates the wooden interior, casting flickering shadows.”
2. Traffic Jams
Prompt: “A busy city street with heavy traffic, honking cars and buses passing by. Skyscrapers stand in the background, neon signs flash, and people hurry along the sidewalk. The scene captures the atmosphere of a hectic, noisy urban city.”
3. Ocean Waves
Prompt: “A quiet beach at sunset, with gentle ocean waves lapping on the shore. The sky is bathed in shades of orange and pink, and seagulls fly overhead. The sound of the waves creates a soothing ambiance.”
4. Thunderstorm
Prompt: “A dramatic thunderstorm over a large field with dark clouds, flashes of lightning, and torrential rain. Thunder rumbles, the wind blows violently, and the trees bend, creating a raw and powerful scene.”
5. Crowded Market
Prompt: “A bustling outdoor market in a vibrant city. Rows of stalls selling fresh produce, spices, and handmade goods. Vendors call out to customers, people haggle, and the sounds of chatter and laughter fill the air. The market is colorful, vibrant, and full of energy.”
6. Construction Sites
Prompt: “A busy construction site with cranes, bulldozers and workers in hard hats. The sounds of hammers, drills and machinery fill the air as a new building takes shape. Dust rises from the ground and the site is buzzing with activity.”
7. Rock Concert
Prompt: “A rock concert is in full swing, with the band performing on stage beneath colorful lights. The crowd is cheering, clapping, and singing along. The lead singer's voice rings out over the powerful music, and the electric guitar and drums create a thrilling, energetic atmosphere.”
