Google CEO Sundar Pichai speaks at the Google I/O developer conference.
Andrei Sokolov | Picture Alliance | Getty Images
Tuesday's announcement follows similar events held by AI competitors. Earlier this month, Amazon-backed Anthropic announced its first enterprise product and free iPhone app. Meanwhile, OpenAI on Monday announced a new AI model and a desktop version of ChatGPT, along with a new user interface.
Here's what Google announced:
Google introduced an update to Gemini 1.5 Pro, whose AI models will soon be able to process even more data. For example, the model can summarize up to 1,500 pages of text uploaded by a user.
There's also a new Gemini 1.5 Flash AI model, which the company says is cost-effective and designed for smaller-scale tasks like quickly summarizing conversations, captioning images and videos, and extracting data from large documents.
Google CEO Sundar Pichai highlighted Gemini's translation improvements, adding that it will be available to all developers worldwide in 35 languages. Pichai says that within Gmail, Gemini 1.5 Pro analyzes attached PDFs and videos and provides summaries and more. This means that if you missed a long email thread while on vacation, Gemini can summarize it along with the attachments.
Gemini's update also helps with search in Gmail. In one example the company gave, a user comparing quotes from different contractors for a roof repair could ask for an overview to help decide which one to choose, and Gemini would return the three estimates and their expected start dates from the email thread.
Google said Gemini will eventually replace Google Assistant on Android phones, hinting that it will be a strong competitor to Apple's Siri on iPhones.
Google announced Veo, its latest model for generating high-definition video, along with Imagen 3, its highest-quality text-to-image model, which promises lifelike images and “fewer distracting visual artifacts than previous models.”
These tools will be available to some creators on Monday and will also come to Vertex AI, Google's machine learning platform that lets developers train and deploy AI applications.
The company also showed off Audio Overviews, a feature that generates spoken discussions based on text input. For example, a user who uploads a lesson plan can hear the chatbot outline it aloud, or ask for real-world examples of a science problem and get answers through interactive audio.
Separately, the company also showcased AI Sandbox, a set of generative AI tools for creating music and sounds from scratch based on user prompts.
However, generative AI tools such as chatbots and image creators still have accuracy issues.
Prabhakar Raghavan, Google's head of search, told employees last month that competitors “may be coming out with new gizmos that people want to play with,” but that people still come to Google because it is a trusted source of information, and that this trust is becoming even more important in the era of generative AI.
Earlier this year, Google introduced an image generator powered by Gemini. After users discovered historical inaccuracies and spread them online, the company said it was pausing the feature and would bring it back within a few weeks. The feature has not yet been re-released.
The tech giant will launch “AI Overviews” in Google Search in the U.S. on Monday, according to Google search chief Liz Reid. AI Overviews display concise summarized answers to even complex search questions. For example, if a user searches for the best way to clean leather boots, an AI Overview may appear at the top of the results page with a multi-step cleaning process synthesized from information across the web.
The company said it plans to bring Assistant-like planning features directly into Search. It explained that users can search for something like “create a 3-day meal plan for a group that's easy to prepare” and get a starting point drawn from a wide range of recipes on the web.
As part of its progress in “multimodality,” or integrating more images and video into its generative AI tools, Google said it will begin testing a feature that lets users ask questions via video, such as filming a problem with a product they own and asking the search engine how to fix it. In one example, Google showed someone filming a broken record player while asking why it wasn't working; Search identified the record player's model and suggested it might be malfunctioning because it wasn't properly balanced.
Another feature being tested, called “AI Teammate,” will be integrated into users' Google Workspace and builds a searchable collection of work from their messages, email threads, PDFs, and documents. For example, a would-be founder could ask the AI teammate, “Are we ready to launch?” and it would provide analysis and summaries based on information accessible in Gmail, Google Docs, and other Workspace apps.
Project Astra is the latest advancement in AI assistants from Google's DeepMind AI division. For now it is just a prototype, but you can think of it as Google's attempt at its own version of JARVIS, Tony Stark's all-knowing AI assistant from the Marvel universe.
A demo video presented at Google I/O showed the assistant working through video and audio rather than a text chatbot interface: it remembered where the user had left their glasses, reviewed code, and answered questions about a specific part of a speaker when it appeared on camera.
Google said that for a chatbot to be truly useful, users should be able to “converse naturally and without lag or delay.” Conversations in the demo video took place in real time, without noticeable delay. The demo followed OpenAI's showcase on Monday of a similar back-and-forth voice conversation with ChatGPT.
DeepMind CEO Demis Hassabis said on stage that “reducing response times to conversational times is a difficult engineering challenge.”
Pichai said he expects Project Astra's capabilities to come to Gemini later this year.
Google also announced Trillium, its sixth-generation TPU, or tensor processing unit, a critical piece of hardware for running complex AI operations. It is expected to be available to cloud customers in late 2024.
TPUs are not intended to compete with other chips such as Nvidia's graphics processing units. For example, Pichai pointed out at I/O that Google Cloud will begin offering Nvidia's Blackwell GPUs in early 2025.
Nvidia announced in March that Google would use the Blackwell platform for a variety of internal deployments and would be one of the first cloud providers to offer Blackwell-powered instances, and said the partnership would help give developers building large language models access to Nvidia's systems at scale.
In his speech, Pichai emphasized Google's “longstanding partnership with Nvidia.” The two companies have worked together for more than a decade, and Pichai has said in the past that he expects the collaboration to continue into the next decade.
