A number of major tech companies, including Apple, Nvidia, and Amazon-backed Anthropic, are reportedly using subtitles from YouTube videos to train their AI models.
Harsh Shivam New Delhi
Apple has clarified that the artificial intelligence features it collectively refers to as Apple Intelligence are not powered by its OpenELM AI model. As reported by 9To5Mac, the Cupertino-based tech giant said in a statement to the media outlet that “OpenELM is not used for any of the company's AI or machine learning features, including Apple Intelligence.”
This comes after Wired reported that a number of tech giants, including Apple, Nvidia, and Amazon-backed Anthropic, are using material from thousands of YouTube videos, including their subtitles, to train their AI models. The report said that Apple trained its OpenELM model using the plain text of its video subtitles, as well as translations into various languages.
Google prohibits videos posted to YouTube from being used in applications that are “independent” of the video platform.
In a statement to 9To5Mac, Apple said OpenELM was developed to contribute to the research community and advance open-source large language model (LLM) development. The company said OpenELM was built purely for research purposes, not to enhance the AI capabilities of its products or devices.
Apple Intelligence Training
Previously, Apple said in a research paper published on June 10 that it doesn't use “private user personal data or user interactions” to train its AI models. However, the tech giant said it does use “publicly available data” from the web, collected by its web crawler AppleBot. The company said web publishers who don't want their content used to train Apple Intelligence must opt out.
Apple OpenELM: What is it?
In April, Apple released the OpenELM AI model in its Hugging Face model library. OpenELM stands for “Open-source Efficient Language Models” and is a series of four small language models that can run on devices such as mobile phones and PCs.
The four OpenELM models have 270 million, 450 million, 1.1 billion, and 3 billion parameters, respectively. Parameters are the internal variables a model learns from its training data and uses to make predictions; broadly, more parameters mean a more capable but larger model.
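To give a rough sense of how parameter counts add up, here is a toy back-of-the-envelope sketch (a hypothetical helper function, not Apple's architecture): even a single fully connected layer of a neural network carries a weight for every input-output pair, plus one bias per output.

```python
def dense_layer_params(n_in: int, n_out: int, bias: bool = True) -> int:
    """Count the learnable parameters in one fully connected layer:
    one weight per (input, output) pair, plus an optional bias per output."""
    return n_in * n_out + (n_out if bias else 0)

# A single 1024-to-1024 layer already holds over a million parameters.
print(dense_layer_params(1024, 1024))  # 1,049,600
```

Stacking dozens of such layers, as language models do, is how totals climb into the hundreds of millions or billions.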
By comparison, Microsoft's Phi-3 model has up to 3.8 billion parameters, while Google's open model Gemma, released earlier this year, offers up to 2 billion.