AI-equipped glasses read out unspoken words

AI News


Summary: Researchers have developed a wearable interface called EchoSpeech that recognizes silent speech by tracking lip and mouth movements through acoustic sensing and AI. The device requires only minutes of user training and recognizes up to 31 unspoken commands. The system could give a voice to people who cannot vocalize, and lets users communicate silently with others.

Source: Cornell University

Researchers at Cornell University used acoustic sensing and artificial intelligence to develop a silent speech recognition interface that continuously recognizes up to 31 unspoken commands based on lip and mouth movements.

The low-power wearable interface, called EchoSpeech, requires just a few minutes of user training data before it can recognize commands, which can then be executed on a smartphone.

Ruidong Zhang, a PhD student in computer science, is the lead author of “EchoSpeech: Continuous Silent Speech Recognition on Minimally-obtrusive Eyewear Powered by Acoustic Sensing,” presented this month at the Association for Computing Machinery Conference on Human Factors in Computing Systems (CHI) in Hamburg, Germany.

“For people who cannot produce sounds, this silent speech technology could be an excellent input for a speech synthesizer,” Zhang said, describing the technology’s intended use.

In its current form, EchoSpeech could be used to communicate with others via smartphone in places where conversation is inconvenient or inappropriate, such as a noisy restaurant or a quiet library. Paired with a stylus, the silent speech interface could also be used with design software such as CAD, largely eliminating the need for a keyboard and mouse.


Outfitted with a pair of microphones and speakers smaller than pencil erasers, the EchoSpeech glasses become a wearable, AI-powered sonar system, sending and receiving sound waves across the face and sensing mouth movements. A deep learning algorithm then analyzes these echo profiles in real time, with about 95% accuracy.
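To make that pipeline concrete, here is a minimal, illustrative Python sketch of acoustic sensing: emit a short frequency sweep, cross-correlate the microphone recording against it to get an echo profile, and hand a sequence of profiles to a classifier. The sample rate, chirp band, delay values, and the toy classifier are all assumptions for illustration; the actual EchoSpeech system uses a trained deep-learning model on real echo profiles.

```python
import numpy as np
from scipy.signal import chirp, correlate

FS = 48_000    # assumed sample rate (Hz); not specified in the article
CHIRP_MS = 12  # assumed probe-chirp duration
N_COMMANDS = 31  # the article's command count

def make_probe() -> np.ndarray:
    """Generate a short frequency sweep used as the probe signal."""
    t = np.linspace(0, CHIRP_MS / 1000, int(FS * CHIRP_MS / 1000), endpoint=False)
    return chirp(t, f0=16_000, f1=21_000, t1=t[-1], method="linear")

def echo_profile(recorded: np.ndarray, probe: np.ndarray) -> np.ndarray:
    """Cross-correlate the recording with the emitted chirp.

    Peaks in the correlation correspond to reflections from the face;
    how those peaks shift over time encodes lip and mouth movement.
    """
    corr = correlate(recorded, probe, mode="valid")
    return np.abs(corr) / (np.linalg.norm(probe) + 1e-9)

def classify(profile_sequence: np.ndarray) -> int:
    """Placeholder classifier: EchoSpeech uses a trained deep neural
    network here; this stub just picks a pseudo-label so the pipeline
    runs end to end."""
    features = profile_sequence.mean(axis=0)
    return int(np.argmax(features[:N_COMMANDS]))

if __name__ == "__main__":
    probe = make_probe()
    # Simulate a recording: the probe plus a delayed, attenuated "facial echo".
    delay = int(0.0008 * FS)  # ~0.8 ms round trip, an illustrative value
    recorded = np.zeros(len(probe) + delay + 512)
    recorded[:len(probe)] += probe
    recorded[delay:delay + len(probe)] += 0.3 * probe
    frames = np.stack([echo_profile(recorded, probe) for _ in range(5)])
    print("predicted command id:", classify(frames))
```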

Cheng Zhang, Assistant Professor of Information Science and Director of Cornell University’s Smart Computer Interfaces for Future Interactions (SciFi) Lab, said:

“We are very excited about this system. Small size, low power, and privacy are all key features for deploying new wearable technologies into the real world.”

Credit: Ruidong Zhang

Most technology for silent speech recognition is limited to a pre-determined set of commands and requires the user to face or wear a camera, which is neither practical nor feasible, says Cheng Zhang. Wearable cameras also raise significant privacy concerns, both for users and for those with whom they interact, he said.

Acoustic sensing technology such as EchoSpeech eliminates the need for wearable video cameras. And because audio data is much smaller than image or video data, it requires less bandwidth to process and can be relayed to a smartphone in real time via Bluetooth, says computer science professor François Guimbretière.
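A quick back-of-envelope calculation illustrates the bandwidth point; all figures below are illustrative assumptions, not values from the paper.

```python
# Compare the raw data rate of a mono audio stream against an
# uncompressed video feed. All parameters are illustrative assumptions.

AUDIO_RATE_HZ = 48_000        # assumed mono microphone sample rate
AUDIO_BYTES_PER_SAMPLE = 2    # 16-bit samples

VIDEO_W, VIDEO_H, VIDEO_FPS = 640, 480, 30  # a modest camera feed
VIDEO_BYTES_PER_PIXEL = 3                   # 24-bit RGB, uncompressed

audio_bps = AUDIO_RATE_HZ * AUDIO_BYTES_PER_SAMPLE * 8
video_bps = VIDEO_W * VIDEO_H * VIDEO_BYTES_PER_PIXEL * VIDEO_FPS * 8

print(f"audio: {audio_bps / 1e6:.2f} Mbit/s")  # ~0.77 Mbit/s
print(f"video: {video_bps / 1e6:.2f} Mbit/s")  # ~221 Mbit/s
print(f"video/audio ratio: {video_bps / audio_bps:.0f}x")  # ~288x
```

At these assumed rates, the acoustic stream fits comfortably within Bluetooth’s few-megabit-per-second budget, while raw video would not.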

“And because the data is processed locally on the smartphone rather than uploaded to the cloud, privacy-sensitive information never leaves the user’s control,” he added.

Summary written with the help of ChatGPT AI technology.

About this AI research news

Author: Becca Bowyer
Source: Cornell University
Contact: Becca Bowyer – Cornell University
Image: The image is in the public domain

Original Research: The findings will be presented at the Conference on Human Factors in Computing Systems (CHI).


