Why YouTube can give Google an edge in AI



Google last month upgraded its Bard chatbot with a new machine learning model that better understands conversational language and can compete with OpenAI's ChatGPT. YouTube is a potential trump card for Google as it develops a successor to that model. From the report: The Google-owned video site is the largest and richest source of images, audio and text transcripts on the internet. Google researchers are already using YouTube to develop their next large-scale language model, Gemini, according to people familiar with the situation. OpenAI hasn't overlooked the value of YouTube either. The startup has secretly used data from the site to train some of its artificial intelligence models, according to a person with direct knowledge of the effort. AI experts at companies competing with Google say owning YouTube could give the company an edge, granting it fuller access to the video data than video-scraping rivals have. This is especially important as AI developers face new obstacles in finding high-quality data with which to train and improve their models. Major website publishers, from Reddit to Stack Exchange to DeviantArt, are increasingly blocking developers from downloading data for that purpose. Before these walls went up, AI startups used data from such sites to develop their models, according to disclosures from the publishers and the startups.

The AI advantage conferred by Google's ownership of YouTube could reinforce concerns about the company's power among antitrust regulators. On Wednesday, the European Commission filed a complaint against Google's power in the ad tech space, arguing that Google favors "proprietary online display advertising technology services to the detriment of competing providers." The U.S. Department of Justice filed a lawsuit against Google in January on similar grounds. Google could use audio transcriptions and descriptions of YouTube videos as another source of text to train Gemini, which could lead to more sophisticated language understanding and more realistic conversational responses. Such a model could also eventually incorporate video and audio directly, giving it the multimodal capabilities that many researchers believe are the next frontier for AI, according to interviews with nearly 10 people familiar with the work. Google CEO Sundar Pichai told investors earlier this month that the in-development Gemini exhibits multimodal capabilities not found in other models, but he declined to provide further details.

