Runway may be using pirated videos to train video models



Runway, another AI company making the rounds in the tech industry, has a powerful video generation model called Gen-3 Alpha. Impressive as the model is, some are unhappy with how the company sourced the videos used to train it. A new report claims that Runway may have trained the model on a large number of pirated videos, including YouTube videos.

Let's not pretend we don't know. Almost all of the media you see on the internet is probably being scraped and used to train AI models. This includes articles, books, social media posts, images, podcasts, videos, and more. Companies scrape all of this content right under our noses, and no one knows until stories like this surface. It's so sad.

A few months ago, there was a bit of a fuss over whether OpenAI had scraped YouTube data to train its video generation tool, Sora. The episode suggested that YouTube and Google do not tolerate companies scraping data from YouTube. The spat has since died down.

Runway may be using pirated videos to train its AI models

Runway's model is impressive, but training it requires a huge amount of video data. That data has to come from somewhere, and 404 Media has reported where Runway may be getting it. The outlet discovered a spreadsheet with links to a large number of YouTube channels, including MrBeast, MKBHD, The Try Guys, Nintendo, BuzzFeed, Netflix, Linus Tech Tips, Sam Kolder, and more.

Runway didn't stop at YouTube. The spreadsheet also contains links to other sites, including the piracy site KissCartoon. In total, it holds about 4,000 links. Each row describes a YouTube channel, with details such as the number of videos and the kind of content the channel creates.

Runway reportedly used crawlers to download these videos and feed them into its models. As if that wasn't bad enough, the company also allegedly used proxies to avoid detection by Google, which suggests it knew Google would object to the scraping of its video data.

We don't know how much of the data in the spreadsheet was actually used to train the model, and unfortunately, we may never know.

Legal implications

This could have some pretty heavy legal consequences. Companies like Microsoft and OpenAI have already been dragged to court for scraping data from The New York Times, and YouTube may have legal grounds to sue Runway, depending on how much raw video data the company scraped.

The list also includes YouTube channels from major companies like Disney, Netflix, and Nintendo, which are sure to have some copyrighted videos on their channels. History has taught us that messing with Nintendo could land you in trouble.

Finally, Runway may have downloaded videos from piracy websites, which is clearly a violation of the law.

Now that this information has been released, we can only wait and see what happens with the company and its video model.


