An AI face appears inside Google’s Beam Lab

I’m told that no one outside Google has seen what we are about to see. No journalist has ever entered this building. Here at Google’s research facility in Mountain View, the company is developing life-sized AI agents that can see you and talk to you.

This one is named Sophie.

Sophie can speak several languages. It can see me and almost everything else in the room. It can read whatever you hold up to it, whether that is a phone, a piece of paper, or a book. And, of course, it can do Google-y things like pull up maps, recommend restaurants, check the weather, and look up quick facts. This time, though, it does all that while wearing a woman’s face and a black turtleneck, and experimenting with body language.

This could be pretty cool, I think, if only it didn’t feel so fake and flat. But it may not feel fake and flat for long.

Image: Matt Piniol/The Verge

Google today unveiled its Beam video agent as an experiment, billing it as an exploration of the future of real-time communication with AI agents using Google Beam.

Beam, you may recall, is the company’s fairly astonishing video conferencing hardware, which makes people feel like their conversation partner is right in front of them in stunning 3D, no glasses required. The first Google Beam product is the $25,000 HP Dimension. Its six cameras don’t actually transmit video of you to the other person. Instead, an AI server combines their feeds to create a volumetric 3D rendering of the person. The result is basically the most realistic video game character I’ve ever seen.

You can now see 3D on your flat screen thanks to this parallax footage that Google helped us capture.
Animation: Google / The Verge

The first time I tried Beam, with Beam boss Andrew Nartker on the other end, visions of the Star Trek holodeck began to dance before my eyes. But what if you weren’t talking to a person in the first place? What if it was a virtual character all along?

Unfortunately, Sophie isn’t 3D yet, and “she” isn’t a character yet, either. At least not with the limited feature set Google has enabled today. Like other second-generation chatbots, Sophie is here to mirror me, act unconvincingly excited by everything I say, and serve as my obedient concierge. After a long pause, Sophie always speaks in tight blocks of text of about the same length, beginning with an acknowledgment and ending with a question about which feature to demo next.

The team says this is intentional: the demo was built specifically so that Google I/O attendees could get a five-minute taste of what Sophie can do, including creating AI-generated images. When I ask for a bedtime story, I’m shown a ridiculous image of a magical contraption being operated with the help of a giant fox. (I think my kids would love it, too.)

Sophie and AI-generated images.
Image: Matt Piniol/The Verge

Excuses aside, this doesn’t feel like talking to a person. There are too many cracks in the facade. Why does Sophie’s accent keep shifting strangely, sometimes slipping into something like a Southern accent and then quickly losing it? Product manager Pavan Kumar says it is meant to be a neutral American accent when speaking American English, but the AI model seems to be drifting unintentionally. I also notice that Sophie keeps making the exact same arm gestures while talking. That is probably because this early experiment is built on a speech model: the text drives the voice, the voice drives the lip-synced face, and the gestures are perhaps just gravy.

(Yes, the obedient AI is, as usual, female. That’s because Sophie has a personality that makes her easy for everyone to talk to, Nartker says.)

Google Labs’ Devika Lal (group product manager), Nick Garoufalis, and Emma van Niekerk join a group call.
Image: Matt Piniol/The Verge

Video agents aren’t Beam’s only new experiment beyond 3D. Google is also showing off group calling in Beam for the first time, and it’s exactly what you’d expect. Googlers can dial in from their laptops or phones, just like on a regular Google Meet call. The feature wasn’t in Beam at launch, but it has been in development for several years. (Google says it also works with Zoom.)

This should definitely make it easier to justify buying a $25,000 device. It also comes with spatial audio, which makes it easier for Beam users to tell who’s talking. However, participants may appear smaller than life size, and if more than two people are on the call, who appears on screen may alternate.

The only weird thing here is that Google is calling group calling an experiment and hasn’t announced a release date. For Beam video agents, the experimental tag makes more sense: the agents aren’t ready yet, and Google isn’t 100 percent sure who they’re for. Still, the company believes they could be useful in workplaces, stores, and schools.

Image: Matt Piniol/The Verge

For me, the most appealing possibility is the one Nartker isn’t talking about. During the tour, he guides me past a robotic arm designed to test Beam’s head-tracking capabilities and server racks full of Beam boards undergoing accelerated lifecycle testing. Many of them run 10-hour loops every day to make sure they hold up to real-world use.

Nartker keeps hinting that there are things we can’t see in other parts of the building, things he had his staff tidy away. At one point, he makes sure certain doors are closed before offering us water. When he explains that his digitally rendered body and my digitally rendered body are both 3D meshes residing on a cloud server, it occurs to me that they could now exist in a virtual world as well. Perhaps you could also view them through a headset?

I ask him about my theory. “There are a lot of windows we want to create, from large windows to small windows, and this is a really great first window,” says Nartker.

“You’re already doing VR demos internally, right?” I ask at another point on the tour.

“It’s a big building, Sean!” he teases. He promises to invite me back.
