AI-powered glasses can read unvoiced speech


It may look like Ruidong Zhang is talking to himself, but in reality the computer science PhD student is silently mouthing a passcode to unlock a nearby smartphone and play the next song in his playlist.

It’s not telepathy. It’s the seemingly ordinary, off-the-shelf glasses he’s wearing, called EchoSpeech — a silent-speech recognition interface that uses acoustic sensing and artificial intelligence to continuously recognize up to 31 unvocalized commands based on lip and mouth movements.

Ruidong Zhang, PhD student in computer science, wearing EchoSpeech glasses.

Developed by Cornell University’s Smart Computer Interfaces for Future Interactions (SciFi) Lab, the low-power wearable interface requires only a few minutes of user training data before it can recognize commands and execute them on a smartphone, researchers say.

Zhang is the lead author of “EchoSpeech: Continuous Silent Speech Recognition on Minimally-obtrusive Eyewear Powered by Acoustic Sensing,” which will be presented this month at the Conference on Human Factors in Computing Systems (CHI) in Hamburg, Germany.

“For people who cannot vocalize sound, this silent speech technology could be an excellent input for a voice synthesizer. It could give patients their voices back,” Zhang said of the technology’s potential uses.

In its current form, EchoSpeech could be used to communicate with others via smartphone in places where speech is inconvenient or inappropriate, such as a noisy restaurant or quiet library. The silent speech interface could also be paired with a stylus and used with design software like CAD, all but eliminating the need for a keyboard and mouse.

Equipped with a pair of microphones and speakers smaller than pencil erasers, the EchoSpeech glasses become a wearable, AI-powered sonar system: they send and receive sound waves across the face and sense mouth movements. A deep learning algorithm, developed by SciFi Lab researchers, then analyzes these echo profiles in real time, with about 95% accuracy.
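The sonar idea can be sketched in a few lines. The code below is an illustrative toy, not the authors’ pipeline: it assumes the speaker emits a known near-ultrasonic chirp (a common probe signal in active acoustic sensing) and estimates an echo profile by cross-correlating the received audio with the transmitted signal. Peaks in that profile correspond to reflection paths, and as the mouth moves, the pattern of peaks changes over time — this is the kind of signal a deep learning model could classify.

```python
import numpy as np

def chirp(fs, duration, f0, f1):
    """Linear chirp: a common probe signal for active acoustic sensing."""
    t = np.arange(int(fs * duration)) / fs
    k = (f1 - f0) / duration
    return np.sin(2 * np.pi * (f0 * t + 0.5 * k * t ** 2))

def echo_profile(tx, rx):
    """Cross-correlate received audio with the transmitted signal.

    Peaks correspond to reflection paths; their positions and strengths
    form the "echo profile" that changes as the face moves.
    """
    return np.abs(np.correlate(rx, tx, mode="valid"))

fs = 48_000                            # typical audio sampling rate
tx = chirp(fs, 0.01, 18_000, 21_000)   # near-ultrasonic, mostly inaudible

# Simulate a received signal: one delayed, attenuated echo plus noise.
delay = 96                             # samples of round-trip delay (~2 ms)
rx = np.zeros(2 * len(tx))
rx[delay:delay + len(tx)] += 0.4 * tx
rx += 0.01 * np.random.default_rng(0).standard_normal(len(rx))

profile = echo_profile(tx, rx)
print(int(np.argmax(profile)))         # strongest echo near the simulated delay
```

In a real system this would run continuously, producing one echo profile per chirp; a classifier over sequences of profiles would then map the changes to commands.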

“We’re putting sonar on the body,” said Cheng Zhang, assistant professor of information science in the Cornell Ann S. Bowers College of Computing and Information Science and director of the SciFi Lab.

“We are very excited about this system,” he said, “since it’s small, low-power and privacy-friendly, all of which are important features for deploying new wearable technologies in the real world.”

The SciFi Lab has developed several wearable devices that track body, hand and facial movements using machine learning and miniature wearable video cameras. Recently, the lab has shifted away from cameras and toward acoustic sensing for tracking face and body movements, citing improved battery life; tighter security and privacy; and smaller, more compact hardware. EchoSpeech builds on the lab’s similar acoustic-sensing device, EarIO, a wearable earbud that tracks facial movements.

Most technology for silent-speech recognition is limited to a predetermined set of commands and requires the user to face or wear a camera, which is neither practical nor feasible, Cheng Zhang said. Wearable cameras also raise significant privacy concerns, both for users and for those they interact with, he said.

Acoustic sensing technologies like EchoSpeech remove the need for wearable video cameras. And because audio data is much smaller than image or video data, it requires less bandwidth to process and can be relayed to a smartphone via Bluetooth in real time, said François Guimbretière, a professor in Cornell Bowers CIS and a co-author.
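A back-of-envelope calculation illustrates the bandwidth point. The numbers below are illustrative assumptions, not figures from the paper: single-channel 16-bit audio at 48 kHz versus raw 640×480 RGB video at 30 frames per second.

```python
# Back-of-envelope data rates (illustrative assumptions, not from the paper).
audio_rate = 48_000 * 2              # 48 kHz, 16-bit mono -> bytes per second
video_rate = 640 * 480 * 3 * 30      # 640x480 RGB at 30 fps -> bytes per second

print(f"audio: {audio_rate / 1e6:.2f} MB/s")   # 0.10 MB/s
print(f"video: {video_rate / 1e6:.2f} MB/s")   # 27.65 MB/s
print(f"raw video is ~{video_rate // audio_rate}x larger")
```

Even before compression, raw audio here is roughly 300 times smaller than raw video, which is why it can be streamed over Bluetooth and processed on a phone in real time.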

“And because the data is processed locally on the smartphone instead of being uploaded to the cloud, privacy-sensitive information never leaves the user’s control,” he said.

According to Cheng Zhang, battery life will also improve exponentially.

The team is exploring commercializing the technology behind EchoSpeech, helped in part by Ignite: Cornell Research Lab to Market gap funding.

In forthcoming work, SciFi Lab researchers are exploring smart-glasses applications for tracking facial, eye and upper-body movements, Cheng Zhang said.

Other co-authors were computer science doctoral students Ke Li, Yihong Hao ’24, Yufan Wang ’24, and Zhengnan Lai ’25. This research was partially funded by the National Science Foundation.

Louis DiPietro is a writer at Cornell Ann S. Bowers College of Computing and Information Science.

