Google plans to introduce a new system called Astra later this year, which it promises will be the most powerful and advanced type of AI assistant it has ever launched.
Current-generation AI assistants like ChatGPT can retrieve information and provide answers, but not much more. This year, however, Google is rebranding its assistants as more advanced “agents,” which the company says exhibit reasoning, planning, and memory skills and can take multiple steps to accomplish tasks.
Oriol Vinyals, vice president of research at Google DeepMind, told MIT Technology Review that people will be able to use Astra through their smartphones and possibly desktop computers, and that the company is also exploring other options, such as incorporating it into smart glasses and other devices.
“We are still in the early stages [of AI agent development],” Google CEO Sundar Pichai said in a phone call ahead of Google's I/O conference today.
“Imagine agents that can see and hear what we’re doing, better understand the situation we’re in, and respond faster in conversation, making the pace and quality of interactions feel much more natural,” says Demis Hassabis, CEO and cofounder of Google DeepMind. “That is what Astra will be.”
Google’s announcement comes a day after competitor OpenAI unveiled its own supercharged AI assistant, GPT-4o. Google DeepMind’s Astra responds to audio and video input in much the same way as GPT-4o, though it is less flirtatious.
In a press demo, a user pointed a smartphone camera and smart glasses at objects and asked Astra what they were. When the person pointed the device out the window and asked, “What neighborhood do you think I’m in?” the AI system was able to identify King’s Cross, London, home of Google DeepMind’s headquarters. It was also able to recall, from earlier in the conversation, that the person’s glasses were sitting on the desk.
The demo illustrates Google DeepMind’s vision of multimodal AI — models capable of processing multiple types of input, such as audio, video, and text — that operates in real time, Vinyals said.
“We are really excited about a future where the assistant can get really close to the user and support them in whatever they want,” he says. Google recently upgraded its artificial-intelligence model Gemini to handle larger amounts of data, an upgrade that allows it to process longer documents and videos and sustain longer conversations.
Technology companies are in the midst of a fierce race for AI supremacy, and AI agents are Big Tech’s latest effort to push the frontier of development. Agents also figure into the story many technology companies, including OpenAI and Google DeepMind, tell about their ambition to build artificial general intelligence, a largely hypothetical idea of superintelligent AI systems.
“Ultimately, you’ll have this one agent that really knows you well, can do lots of things for you, and can work across multiple tasks and domains,” says Chirag Shah, a professor at the University of Washington who specializes in online search.
This vision remains ambitious, and today’s announcement should be seen as Google’s attempt to keep pace with its competitors. By rushing these products out, Shah says, Google can also gather even more data from its more than 1 billion users about how the models are being used and what works.
Agents are not the only new AI capability Google is announcing today. The company plans to integrate AI even more deeply into search through a new feature called AI Overviews, which gathers information from across the internet and packages it into short summaries in response to search queries. The feature launches today, initially in the United States only, with more countries to follow later.
Felix Simon, a researcher in AI and digital news at the Reuters Institute for the Study of Journalism, says this will speed up the search process and help users get more specific answers to more complex, niche questions. “I think that’s where search has always struggled,” he says.
Another new feature of Google’s AI-powered search is better planning. People will soon be able to ask Search for dining and travel suggestions, much as they would ask a travel agent to suggest restaurants and hotels. Gemini will be able to help plan what’s needed to make a recipe and what to buy, and people will be able to converse with the AI system about everything from relatively mundane tasks, such as getting a weather forecast, to highly complex ones, such as preparing for a job interview or an important speech.
You can also interrupt Gemini mid-sentence and ask clarifying questions, just like in a real conversation.
In another move against competitor OpenAI, Google also announced Veo, a new video-generation AI system. Veo can generate short videos and gives users more control over cinematic style by understanding prompts like “time lapse” or “aerial shots of a landscape.”
Because Google owns YouTube, it has a significant advantage when it comes to training video-generation models. It has already announced collaborations with artists such as Donald Glover and Wyclef Jean, who are using the technology to create their work.
Earlier this year, OpenAI CTO Mira Murati stayed mum when asked whether the company’s models had been trained on YouTube data. Douglas Eck, senior research director at Google DeepMind, was similarly evasive when asked by MIT Technology Review about the training data used to create Veo, though he did say that it “may be trained on some YouTube content in accordance with our agreements with YouTube creators.”
On one hand, Google is presenting its generative AI as a tool artists can use to make work, but that tool is likely to gain its abilities by training on existing artists’ material, says Shah. AI companies such as Google and OpenAI are facing numerous lawsuits from writers and artists who claim their intellectual property has been used without consent or compensation.
“For artists, it’s a double-edged sword,” Shah says.