Technology cannot yet let other people hear the words you are only planning to say, but scientists have created a device that converts a person’s brain patterns into intelligible speech.
In a study published Tuesday in the journal Scientific Reports, a team of researchers led by Columbia University neuroscientist Nima Mesgarani detailed their newly developed interface, which combines deep learning and speech synthesis technologies. As explained by Gizmodo, the neuroprosthetic device gathers activity patterns from the brain’s auditory cortex, then uses an “AI-powered” vocoder to translate them into what was described as intelligible, albeit “very robotic sounding,” speech.
While the device could eventually give a voice to people who have lost the ability to speak, Gizmodo stressed that it still has limitations. Besides noting that it is still at an early stage of development, the publication clarified that the device does not translate a person’s “covert speech,” the words an individual is thinking but not actually saying. What the interface captures instead is the brain’s response to recordings of other people speaking. These responses are then decoded by a deep neural network and converted into reconstructed versions of the speech the person heard.
Describing the methods used by Mesgarani and his colleagues, Gizmodo wrote that the team recruited five epilepsy patients as volunteers and used invasive electrocorticography to track their neural activity while they listened to “continuous” speech sounds, such as recordings of people repeatedly reciting the digits zero through nine.
Once the brain patterns were converted into synthesized speech, a separate group of 11 participants was asked to listen to the results, according to Futurism. Listeners correctly identified the spoken digits about 75 percent of the time, and most could tell whether the speaker was male or female.
In an email to Gizmodo, Mesgarani explained that the main goal of his team’s study was to “recover the sound” while translating brain patterns into speech, but stressed that speech is more than just stringing words together into sentences.
“It is possible to also decode phonemes [distinct units of sound] or words, however, speech has a lot more information than just the content—such as the speaker [with their distinct voice and style], intonation, emotional tone, and so on,” he added.
Going forward, the researchers hope to refine their technology by replicating the study without the need for participants to listen to recordings. As quoted by Futurism, Mesgarani said this is akin to a participant thinking of a phrase like “I need a glass of water” and his team’s device converting that same thought into synthesized speech.
“This would be a game changer,” Mesgarani concluded. “It would give anyone who has lost their ability to speak, whether through injury or disease, the renewed chance to connect to the world around them.”