Midjourney, an image generation tool best known for running inside Discord servers, is spreading its AI wings: Midjourney's developers announced Tuesday that they plan to introduce a “text-to-video” model in the coming months.
The company will begin training video models in January, CEO David Holz said during a Discord “Office Hour” session. The move represents a natural progression for the platform, which can build on its mature image models to gain a competitive edge in generative video.
Notes from the Discord session included planned tweaks to V6 Niji, Midjourney's manga/anime generator model, as well as consistency fixes for the upcoming Midjourney V6. The company also noted on the to-do list that they will “start training a new video model,” which they said could be ready “within a few months.”
Neither Holz nor the Midjourney team shared further details about the model.
Midjourney is known for prioritizing quality and user experience over raw speed, even if it means falling behind competitors. The company rolled out enhancements like inpainting and outpainting months after those features had become the de facto standard on other platforms like Stable Diffusion. Its recent foray into basic text generation likewise came after that capability had become common on other models like DALL-E 3, SDXL, and even less popular generators like Ideogram and IF.
Entering a competitive field
The move into video also comes on the heels of competitor releases: Stability AI recently announced Stable Video Diffusion, and Meta just unveiled its Emu Video generator. Midjourney's entry puts it on a hotly contested playing field, where existing models like Pika and Runway ML are already carving out a space of their own. Additionally, other image generators like Leonardo AI have already implemented video generation capabilities, further increasing the competition.
Midjourney's recent version 6 update, which features improved prompt tracking and more realistic imagery, is the company's latest effort to stay competitive. If the company can show consistency in its model, it could cement its place in this emerging field, even if the technology is still far from perfect.
The impact of these developments goes far beyond corporate dominance. As Midjourney and other companies innovate and improve, the creative and media industries are on the brink of a transformative era. The ability to generate, manipulate, and interact with video content through AI opens up a host of possibilities, from making things easier for entertainers and advertisers to completely transforming our perception of reality.
Editor: Ryan Ozawa.