OpenAI's Sora AI video generator is an impressive technology that can create stunning visuals and complex clips, but it may not be ready to replace Hollywood just yet.
One of Sora's most high-profile uses since its February announcement was a short film called “Air Head” by Canadian production studio Shy Kids. It features a man with a balloon for a head — an air head, quite literally — who talks about his life and problems.
While it was clear the audio was human-made and the video was edited together from Sora clips, the implication was that the clips themselves were used exactly as Sora produced them.
Apparently not. In an interview with fxguide, Patrick Cederberg from the Shy Kids team revealed that they had to rely on traditional VFX techniques to solve consistency issues with the shots generated by OpenAI's tool.
Overall, this isn't really a problem — it likely reflects how AI video will actually be used in the filmmaking process — but the lack of transparency doesn't reflect well on OpenAI.
How was Air Head actually created?
OpenAI has been reluctant to provide large-scale access to its AI video generator. This is partly because of the cost and time it takes to produce a single clip, but safety is also a consideration: in an election year, very high-quality synthetic content has obvious potential for misuse.
Shy Kids was one of more than a dozen creative teams offered the opportunity to test drive Sora, which is accessed through a ChatGPT-like interface and has copyright guardrails in place.
Cederberg, who was in charge of post-production on Air Head, told fxguide: “I think that's the case with any generative AI tool. At the moment, control is both the most desirable thing and the most elusive.”
Examples of this include unwanted faces appearing on the balloon, or the yellow balloon coming out red. One of the most notable scenes, in which the character chases his head across a courtyard, required rotoscoping in Adobe After Effects.
In Sora's original clip, a man's head was visible and the balloon was red. Because Sora couldn't render the shot exactly as intended, the team painted out the head and corrected the balloon's color in After Effects.
What is it like to work with Sora?
According to Cederberg, this is not a quick process. Regardless of a clip's length, rendering can take up to 20 minutes, or longer when demand on server time is high.
“We had to adjust the timing quite a bit to keep everything from feeling like a big slow-motion project.”
— Patrick Cederberg, Shy Kids
“I usually generate the full 20 seconds, because it gives me more opportunities to cut and edit the content, and hopefully a better chance of getting something that looks good,” he told fxguide.
Sora is accessed through a ChatGPT-style interface, and prompts are refined by an OpenAI chatbot before being sent to the video generator. Cederberg said long, highly descriptive prompts are often necessary to ensure consistency, but even that doesn't always work.
To make it work, the team approached Air Head more like a documentary than a scripted short film. Rather than writing a script and shooting to match it, they generated a large amount of material and crafted it into a story — because with Sora, you never quite know what kind of shot you'll get.
Sora also seems to suffer from the same problem as existing AI generators like Runway and Pika Labs: generated clips play noticeably slower than real footage. “We had to adjust the timing quite a bit to keep it from feeling like a big slow-motion project,” Cederberg said.
What does this mean for AI video?
“Remember the ‘Air Head’ video we made with Sora? It turns out there was a lot of rotoscoping and manual VFX used. The ‘head’ grows back and the balloon's color continues to change with each generation. TL;DR: researchers and developers of generative AI tools really need… pic.twitter.com/nHP5nCe0PP” — April 25, 2024
In fact, for some time to come, AI-generated content will be used by filmmakers as part of their workflow rather than as a replacement for filmmaking itself. The work Adobe has done to integrate generated video into Premiere Pro is a good indicator of what's to come.
The Shy Kids experience is perhaps the best example yet of how AI video can be used. It also echoes comments from LA-based director Paul Trillo, who used Sora to create a promotional video imagining the next generation of TED Talks. Trillo likewise noted that it takes a great many generations to get the desired output — producing hundreds of clips to end up with a handful that work.
Cederberg sees Sora not as a replacement for his regular process but as a complementary VFX tool — much as Adobe suggests using it to generate B-roll or extend existing clips.
As with the rise of digital VFX and other breakthrough technologies before it, Sora and AI video could usher in a new generation of movies — and perhaps a new golden age of cinema.
