- Project Astra is an experimental effort to reimagine the possibilities for future AI agents.
- Google will be testing new AI technology at its I/O conference.
- I danced with an AI agent and spoke with Gregory Wayne, the head of Project Astra (he is human).
Project Astra was the coolest new technology Google showed off at Tuesday's I/O conference.
After the keynote speech, journalists gathered in the Project Astra demonstration area. Project Astra is an experimental effort to reimagine what future AI agents will look like.
“We've always wanted to build universal agents that help people in everyday life,” said Demis Hassabis, head of Google DeepMind.
At the test site, Google ushered journalists into small demo booths four at a time. While we waited in line, two members of DeepMind's Project Astra team explained how to use the technology.
They explained that there are four modes to try: Storyteller, Pictionary, Freeform, and Alliteration.
Freeform mode
I tried freeform mode. Reuters reporter Max Charney held an Android phone provided by Google and pointed the camera at me.
“Describe his clothing,” Max said.
Google's Gemini model analyzed the live video on the phone and said I was wearing casual clothes. An accurate, if safe, answer.
I started dancing and Max asked, “What's he doing now?” Gemini's response was partly off. It said I was wearing sunglasses. That was true, since I had put on sunnies for the dance, but it made no mention of any dancing. Granted, I'm not a very good dancer, and the AI model may have picked up on that.
No stock price information
Max then asked Gemini to critique my outfit, hoping the model would offer an opinion of its own.
The AI model replied that it was “currently unable to provide you with a stock price.” We all paused as the AI's magic abruptly broke.
Cars and stories
Next, I moved to a large touch screen where I could try the four modes. I chose Pictionary. I drew a really bad car, and Gemini said, “It looks like a car.”
I asked Gemini to work the car into an interesting story. “The sleek blue car sped down the highway like a lone traveler in the moonlight,” Gemini said.
I drew the car driving into the market, and Gemini said that doing so would be risky in real life, but that it might be a good development for the story.
“Really?” Gemini asked.
I drew a fruit stand at the market; it was even worse than the car. Then I suggested that someone steal the fruit in the story.
“Ah, the fruit thieves add an interesting twist. Have they gotten away with it so far?” said Gemini.
At this point the demo ended and we were ushered out of the booth.
Gregory Wayne and Captain Cook
Just outside, I met Gregory Wayne, head of Project Astra. He has been with DeepMind for about 10 years, and we discussed the origins of the project.
He said he has long been fascinated by how humans use language to communicate: not only the written and spoken word, but all the other forms of communication that make human interaction rich and satisfying.
He told the story of Captain Cook's arrival in Tierra del Fuego and his meeting with the inhabitants. Since they did not speak the same language, they communicated through gestures, such as picking up sticks and throwing them aside, which signaled to Cook and his crew that they were welcome.
Wayne and his colleagues were fascinated by this story because it illustrated all the ways humans can communicate with each other beyond written and spoken words.
Beyond chatbots
Wayne said this is part of what inspired Project Astra. The goal is to go beyond what chatbots currently do, which is mainly understanding written and spoken language and holding simple back-and-forth conversations: the computer says something, the human responds, and the computer responds again.
One of Project Astra's main goals is to make AI models understand the many other things happening around text- and voice-based communication. That could be a hand signal, or the context of what is happening in the world at the moment of the conversation.
The future could include an AI model or agent that notices what is in the background of a video feed and alerts the human in the conversation, letting the user know when a bike is approaching or a traffic light changes color.
The options are endless, including an AI model that can read the room and understand when to stop talking and let the human speak.
SuperEvilMegaCorp
I told Wayne about the slightly disappointing moment when Gemini declined to critique my outfit and instead said it couldn't provide a stock price at that time.
He immediately noticed my T-shirt, which bore the logo of a real startup: “SuperEvilMegaCorp.” Wayne guessed that Gemini had read the company name and assumed we wanted the company's financial information.
SuperEvilMegaCorp is a Silicon Valley gaming startup that is not publicly traded, so real-time stock information isn't available. Gemini didn't know that. Maybe it's learning it now.
