Researchers at UC San Francisco have developed a brain-machine interface that may allow speech-impaired patients to “speak” through the device. The researchers have described the system as a stepping stone to neural speech prostheses. The system monitors the brain activity of a user and then converts this to natural-sounding speech using a virtual vocal tract. This computer simulation includes anatomically accurate representations of a larynx, tongue, lips, and jaw.
Patients can lose the ability to speak because of a variety of factors, including neurodegenerative diseases, strokes, and brain injuries. Current assistive technologies can allow certain patients to spell out words using small facial movements and other techniques. While these technologies are useful, communicating in this way can be time-consuming.
To provide a better way for such patients to communicate, the researchers developed a brain-machine interface that can translate activity in the speech centers of the brain into natural-sounding speech. This is a complex undertaking, because the way the speech centers coordinate the movements of the vocal tract is complicated.
“The relationship between the movements of the vocal tract and the speech sounds that are produced is a complicated one,” said Gopala Anumanchipalli, a researcher involved in the study. “We reasoned that if these speech centers in the brain are encoding movements rather than sounds, we should try to do the same in decoding those signals.”
The researchers created a virtual vocal tract and then used machine learning to get it to produce the correct sounds. Volunteers said specific phrases aloud while their brain activity was monitored. Machine learning was used to match these neural signals to movements in the virtual vocal tract that produced a natural sound that closely matched the original phrase.
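The two-step idea described above, decoding brain activity into vocal tract movements first and only then into sound, can be illustrated with a small toy program. This is a minimal sketch and not the researchers' method: the article does not say which machine learning models were used, so a simple nearest-neighbor lookup stands in for them, and every name and number here (`TwoStageDecoder`, the three-value "neural" frames, the two-value lip and tongue positions, the sound labels) is a hypothetical invented for illustration.

```python
import math


def nearest(query, examples):
    """Return the index of the example closest to query (Euclidean distance)."""
    return min(range(len(examples)), key=lambda i: math.dist(query, examples[i]))


class TwoStageDecoder:
    """Toy decoder: neural features -> articulator positions -> sound label.

    Hypothetical stand-in for the study's models, which the article
    does not describe in detail.
    """

    def fit(self, neural, articulation, sounds):
        # "Training" for a nearest-neighbor model is just storing the examples.
        self.neural = neural
        self.articulation = articulation
        self.sounds = sounds
        return self

    def decode_movement(self, neural_frame):
        # Stage 1: find the recorded neural pattern closest to this frame
        # and return the vocal tract position that accompanied it.
        return self.articulation[nearest(neural_frame, self.neural)]

    def decode_sound(self, movement):
        # Stage 2: find the known vocal tract position closest to this
        # movement and return the sound it produced.
        return self.sounds[nearest(movement, self.articulation)]

    def decode(self, frames):
        return [self.decode_sound(self.decode_movement(f)) for f in frames]


# Made-up training data: one row per recorded moment of speech.
neural = [(0.9, 0.1, 0.0), (0.1, 0.8, 0.1), (0.0, 0.2, 0.9)]  # fake activity
articulation = [(0.0, 0.2), (0.6, 0.9), (0.8, 0.1)]  # (lip opening, tongue height)
sounds = ["m", "i", "a"]

decoder = TwoStageDecoder().fit(neural, articulation, sounds)

# New, slightly noisy neural frames that were not in the training data.
decoded = decoder.decode([(0.8, 0.2, 0.1), (0.1, 0.1, 0.7)])
print(decoded)
```

The point of the sketch is the intermediate step: the decoder never maps brain activity directly to sound, but always passes through a movement of the (here, drastically simplified) vocal tract. A real system would work on continuous signals and produce audio rather than labels.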
“We still have a ways to go to perfectly mimic spoken language,” said Josh Chartier, another researcher involved in the study. “We’re quite good at synthesizing slower speech sounds like ‘sh’ and ‘z’ as well as maintaining the rhythms and intonations of speech and the speaker’s gender and identity, but some of the more abrupt sounds like ‘b’s and ‘p’s get a bit fuzzy. Still, the levels of accuracy we produced here would be an amazing improvement in real-time communication compared to what’s currently available.”
The researchers hope that the technology could provide a convenient and powerful way to communicate for those who cannot speak.
“People who can’t move their arms and legs have learned to control robotic limbs with their brains,” said Chartier. “We are hopeful that one day people with speech disabilities will be able to learn to speak again using this brain-controlled artificial vocal tract.”
Study in Nature: Speech synthesis from neural decoding of spoken sentences…