Sunday, February 28, 2021

Engineers translate brain signals directly into speech


Credit: CC0 Public Domain

In a scientific first, Columbia neuroengineers have created a system that translates thought into intelligible, recognizable speech. By monitoring someone's brain activity, the technology can reconstruct the words a person hears with unprecedented clarity. This breakthrough, which harnesses the power of speech synthesizers and artificial intelligence, could lead to new ways for computers to communicate directly with the brain. It also lays the groundwork for helping people who cannot speak, such as those living with amyotrophic lateral sclerosis (ALS) or recovering from a stroke, regain their ability to communicate with the outside world.

These findings were published today in Scientific Reports.

"Our voices help connect us to our friends, family and the world around us, which is why losing the power of one's voice to injury or disease is so devastating," said Nima Mesgarani, Ph.D., a principal investigator at Columbia University's Mortimer B. Zuckerman Mind Brain Behavior Institute. "With today's study, we have a potential way to restore that power. We have shown that, with the right technology, these people's thoughts could be decoded and understood by any listener."

Decades of research have shown that when people speak, or even imagine speaking, telltale patterns of activity appear in their brains. A distinct (but recognizable) pattern of signals also emerges when we listen to someone speak, or imagine listening. Experts trying to record and decode these patterns see a future in which thoughts need not remain hidden inside the brain, but could instead be translated into verbal speech at will.

But accomplishing this feat has proven challenging. Early efforts to decode brain signals by Dr. Mesgarani and others focused on simple computer models that analyzed spectrograms, which are visual representations of sound frequencies.
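As a rough illustration of what a spectrogram is, the sketch below slices a sound into short overlapping frames and measures how much energy each frame holds at each frequency. It is not the researchers' code: the function name, the frame length, the hop size, and the 100 Hz test tone are all assumptions chosen to keep the example small, and it uses only the Python standard library.

```python
import cmath
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Return a list of frames; each frame lists the magnitude of every frequency bin."""
    # A Hann window tapers each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_len)]
        # Naive discrete Fourier transform, keeping non-negative frequencies only.
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical test signal: a 100 Hz tone sampled at 800 Hz for half a second.
SAMPLE_RATE = 800
FRAME_LEN = 64
tone = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(400)]
spec = spectrogram(tone, frame_len=FRAME_LEN, hop=32)

# Each bin spans SAMPLE_RATE / FRAME_LEN = 12.5 Hz; find the loudest bin per frame.
peak_bins = [max(range(len(frame)), key=frame.__getitem__) for frame in spec]
peak_hz = [b * SAMPLE_RATE / FRAME_LEN for b in peak_bins]
```

Plotting `spec` as an image, with time on one axis and frequency on the other, gives the familiar visual representation. For this steady tone, every frame peaks at the same frequency bin.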

But because this approach failed to produce anything resembling intelligible speech, Dr. Mesgarani's team turned instead to a vocoder, a computer algorithm that can synthesize speech after being trained on recordings of people talking.

"This is the same technology used by Amazon Echo and Apple Siri to give verbal responses to our questions," said Dr. Mesgarani, who is also an associate professor of electrical engineering at Columbia's Fu Foundation School of Engineering and Applied Science.

A representation of early approaches for reconstructing speech, using linear models and spectrograms. Credit: Nima Mesgarani / Columbia Zuckerman Institute

To teach the vocoder to interpret brain activity, Dr. Mesgarani teamed up with Ashesh Dinesh Mehta, MD, Ph.D., a neurosurgeon at the Northwell Health Physician Partners Neuroscience Institute and co-author of today's paper. Dr. Mehta treats epilepsy patients, some of whom must undergo regular surgeries.

"Working with Dr. Mehta, we asked epilepsy patients who were already undergoing brain surgery to listen to sentences spoken by different people, while we measured patterns of brain activity," said Dr. Mesgarani. "These neural patterns trained the vocoder."

Next, the researchers asked the same patients to listen to speakers reciting digits between 0 and 9, while recording brain signals that could then be run through the vocoder. The sound produced by the vocoder in response to those signals was analyzed and cleaned up by neural networks, a type of artificial intelligence that mimics the structure of neurons in the biological brain.

A representation of Dr. Mesgarani's new approach, which uses a vocoder and a deep neural network to reconstruct speech. Credit: Nima Mesgarani / Columbia Zuckerman Institute

The end result was a robotic-sounding voice reciting a sequence of numbers. To test the accuracy of the recording, Dr. Mesgarani and his team asked individuals to listen to the recording and report what they heard.

"We found that people could understand and repeat the sounds about 75% of the time, which is well above and beyond any previous attempts," said Dr. Mesgarani. The improvement in intelligibility was especially evident when the new recordings were compared with the earlier, spectrogram-based attempts. "The sensitive vocoder and powerful neural networks represented the sounds the patients had originally listened to with surprising accuracy."
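One simple way to score a listening test like this is the fraction of trials in which a listener reports the digit that was actually played. The sketch below shows that arithmetic only. The function name and the trial data are invented for illustration and are not taken from the study, whose exact scoring procedure is not described in this article.

```python
def intelligibility(spoken, reported):
    """Fraction of trials in which the listener reported the digit that was played."""
    if len(spoken) != len(reported):
        raise ValueError("each trial needs one spoken and one reported digit")
    if not spoken:
        raise ValueError("at least one trial is required")
    correct = sum(1 for s, r in zip(spoken, reported) if s == r)
    return correct / len(spoken)


# Made-up trials for illustration only; these are not data from the study.
spoken_digits = [3, 7, 1, 9, 0, 4, 6, 2]
reported_digits = [3, 7, 1, 5, 0, 4, 8, 2]
score = intelligibility(spoken_digits, reported_digits)
print(f"{score:.0%} of digits identified correctly")
```

In this toy run, six of the eight digits match, so the score is 0.75.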

Dr. Mesgarani and his team plan to test more complicated words and sentences next, and they want to run the same tests on brain signals emitted when a person speaks or imagines speaking. Ultimately, they hope their system could be part of an implant, similar to those worn by some epilepsy patients, that translates the wearer's thoughts directly into words.

"In this scenario, if the wearer thinks 'I need a glass of water,' our system could take the brain signals generated by that thought and turn them into synthesized, verbal speech," said Dr. Mesgarani. "This would be a game changer. It would give anyone who has lost their ability to speak, whether through injury or disease, the renewed chance to connect to the world around them."

This paper is titled "Towards reconstructing intelligible speech from the human auditory cortex."


Provided by
Columbia University

Engineers translate brain signals directly into speech (2019, January 29)
retrieved on January 29, 2019

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without written permission. The content is provided for information purposes only.
