Runway, one of the first publicly available AI video generation platforms, has launched the third generation of its models, marking a major step forward for the technology and potentially one of the best AI video generators yet.
Just as OpenAI's end goal is artificial general intelligence, for Runway it's a general-purpose world model: an AI system that can build an internal representation of an environment and use it to simulate events within that environment.
Runway's new model, Gen-3 Alpha, is the closest the startup has come to achieving that long-term goal, and the company says it will power all of the image-to-video and text-to-video tools on the Runway platform, as well as features like Motion Brush and text-to-image generation.
How is Gen-3 different from Gen-2?
From Runway's announcement on X (June 17, 2024): "Introducing Gen-3 Alpha: Runway's new base model for video generation. Gen-3 Alpha is capable of producing highly detailed videos with complex scene changes, a wide range of cinematic options, and detailed art direction."
Runway has not said exactly when Gen-3 will replace the current second-generation model, but it did say the new generation will ship with new safeguards, including improved visual moderation and support for the C2PA standard, which makes it easier to trace the origins of different types of media.
It's the latest in a new generation of AI video models, alongside OpenAI's Sora, Luma Labs' Dream Machine and Kling, each offering longer clips and improved motion.
Runway says Gen-3 is the first in a series of models trained on its new infrastructure, which is built specifically for large-scale multi-modal training, resulting in improved fidelity, consistency, and motion.
One of the lessons the industry learned from Sora is that scale matters more than most other factors: adding more compute and data can dramatically improve these models.
What is Gen-3 like?
From the same announcement thread: "This technological breakthrough marks an important milestone in our efforts to empower artists and pave the way for the next generation of creative and artistic innovation. Gen-3 Alpha will be available to all users in the coming days. Prompt: a slow cinematic push…"
The new model is trained on video and images simultaneously, which Runway says improves the visual quality of videos generated from text prompts.
It also opens the door to new tools that give users more control over structure, style, movement, and more.
I haven't had a chance to try Gen-3 myself yet, as it is still in alpha, but the demo videos suggest a big improvement in motion quality and prompt adherence.
Each video will be approximately 10 seconds long: roughly twice the length of Luma's default, similar in length to Sora's clips, and roughly three times the length of current Runway Gen-2 videos.
1. Take the train
Prompt: “A woman's subtle reflection in the window of a train traveling at lightning speed through a Japanese city.”
2. Urban Astronaut
Prompt: “Astronauts running through the streets of Rio de Janeiro.”
3. Underwater communities
Prompt: “FPV flight over the colorful coral-lined streets of an underwater suburban neighborhood.”
4. Hot Air Balloon
Prompt: “A nighttime handheld tracking shot of a dirty blue balloon floating above the ground in an abandoned old European street.”
5. The big picture
Prompt: “An extreme close-up shot of ants emerging from their nest. The camera pulls back to reveal a residential neighborhood over the hill.”
6. In the empty gym
Prompt: “Zoom in on the face of a young woman sitting on a bench in the middle of an empty school gym.”
7. Fly a drone over a castle
Prompt: “FPV drone captures a clifftop castle.”