Amid last week’s controversies over AI regulation, apocalyptic fears, and job turmoil, the clouds parted briefly. For a moment, we could enjoy an AI-generated video of Will Smith eating spaghetti that is absolutely ridiculous.
On Monday, a Reddit user named “chaindrop” shared the AI-generated video on the r/StableDiffusion subreddit. It spread to other social media platforms and inspired various ruminations in the press. Vice, for example, said the video “will haunt you for life,” and The AV Club called it “the natural end point of AI development.”
But we’re getting ahead of ourselves. The 20-second silent video consists of 10 independently generated two-second segments strung together. Each shows a different angle of a simulated Will Smith (at one point, two Will Smiths) voraciously devouring spaghetti. Thanks to AI, it is entirely computer generated.
You can watch it below:
Did you see this kind of advanced deepfake technology in 1987’s The Running Man? No, but Jesse “The Body” Ventura did defeat a fake Arnold Schwarzenegger in a cage match on a dystopian game show set between 2017 and 2019.
This feat is made possible by a new open-source AI tool called ModelScope, released a few weeks ago by DAMO Vision Intelligence Lab, a research division of Alibaba. ModelScope is a “text2video” diffusion model trained to create new videos from prompts by analyzing millions of images and thousands of videos scraped from the LAION5B, ImageNet, and WebVid datasets. That includes videos from Shutterstock, which is why a ghostly “Shutterstock” watermark appears on the output.
The AI community hub Hugging Face currently hosts an online demo of ModelScope, but it requires an account, and you have to pay for compute time to run it. We tried to use it, but it was overloaded, probably due to Smith-spaghetti mania.
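If the demo is jammed, the model can also be run locally. Here is a minimal sketch, assuming the Hugging Face diffusers library’s text-to-video support, the damo-vilab/text-to-video-ms-1.7b checkpoint, and a GPU with enough memory; the frame count and output file name are illustrative, not chaindrop’s actual settings:

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

# Load the ModelScope text2video weights in half precision to fit consumer GPU memory.
pipe = DiffusionPipeline.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.enable_model_cpu_offload()  # keep idle sub-models on the CPU to save VRAM

# Generate a short clip from the now-famous prompt.
result = pipe("Will Smith is eating spaghetti", num_frames=48)

# Write the frames out as an MP4. (On newer diffusers versions the output is
# batched, so you may need result.frames[0] instead of result.frames.)
print(export_to_video(result.frames, "will_smith_spaghetti.mp4"))
```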
According to chaindrop, the workflow for creating the video was simple: give ModelScope the prompt “Will Smith is eating spaghetti” and generate the clips at 24 frames per second (FPS). Chaindrop then used the Flowframes interpolation tool to increase the frame rate from 24 to 48 FPS, then slowed the result to half speed to make the video look smoother.
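Flowframes is a desktop GUI, so there is no one-line equivalent of chaindrop’s exact step, but ffmpeg’s minterpolate filter sketches the same idea: motion-interpolate the 24 FPS clip up to 48 FPS, then stretch the timestamps so it plays back at half speed. The file names below are hypothetical:

```python
import subprocess

# Rough command-line stand-in for the Flowframes step (not the tool chaindrop used):
# interpolate to 48 FPS, then double every timestamp to halve the playback speed.
subprocess.run([
    "ffmpeg", "-i", "will_smith_spaghetti.mp4",
    "-vf", "minterpolate=fps=48:mi_mode=mci,setpts=2.0*PTS",
    "-r", "24",  # half-speed 48 FPS footage lands back at 24 FPS
    "will_smith_spaghetti_smooth.mp4",
], check=True)
```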
Of course, ModelScope isn’t the only game in town in the emerging text2video field. Runway recently debuted “Gen-2,” and we have previously covered earlier text2video research projects from Meta and Google.
Ever since Will Smith eating spaghetti became a viral hit, the internet has been graced with follow-ups such as Joe Biden eating spaghetti. There’s also Smith eating meatballs, which is arguably an even more horrifying video.
Of course, if the output of these text2video tools becomes too realistic, there will be other issues to deal with, perhaps deep social and cultural ones. But for now, let’s enjoy ModelScope’s imperfect and terrifying glory. We apologize in advance.