This is not investment advice. The author has no investments in any of the stocks mentioned. WCCF TECH INC has a disclosure and ethics policy.
According to a report from 404 Media, Jupiter, an AI-based video generation tool, was trained on an extensive database of videos collected from YouTube and other sources. Jupiter is run by Runway AI, Inc., a $1.5 billion startup funded by industry giants such as Google and NVIDIA. The material obtained by 404 Media includes spreadsheets listing the YouTube channels of the world's largest media groups and content creators, websites hosting pirated content, and links to specific videos.
Anonymous sources told the publication that the spreadsheet was part of a “company-wide” effort to scrape content from across the internet. The report is the latest in a series of controversies over companies using creators' work to train their models without paying for it.
Spreadsheet used to feed data to crawlers that download videos via proxies, claims source
According to the report, the spreadsheet contains links to the YouTube channels of some of the biggest names in the media industry, including Netflix, Disney, Sony, Pixar, and Vice News. It also lists content creators such as popular tech YouTuber Marques Brownlee and lifestyle vlogger Casey Neistat.
In addition to obtaining the spreadsheet, 404 Media spoke with a source who claims to have worked at Runway. The source, whose identity is being kept confidential, provided key details about how the company used the spreadsheet internally to train its video-generating AI models.
According to the source, Runway used the spreadsheet to feed open source software that scraped content from YouTube. The source described the channels in the spreadsheet as part of “a company-wide effort to find good videos to build models on.”

Runway did not respond to 404 Media's request for comment, and Google referred the publication to a statement it made earlier this year. In April, the company said that using YouTube videos to train OpenAI's Sora video generator would violate YouTube's rules.
Sources said Runway also tasked employees with searching for videos by keyword to target specific types of content. Videos were also categorized by subject, such as animated shorts or student films. The scraping wasn't limited to YouTube; it also covered sites hosting pirated content.
Jupiter is Runway's internal code name for its Gen-3 models. 404 Media said its test prompts led Gen-3 to generate content similar to the videos allegedly scraped from YouTube, adding that the model stopped generating such videos after the publication reached out to Runway for comment.
Runway is one of the most popular AI video generation companies. In its Series C extension funding round last June, the company was valued at $1.5 billion. That same month, it was also named to TIME magazine's “100 Most Influential Companies” list, one of more than a dozen AI companies included.