What is the clinical viability of using speech neuroprosthetic technology to restore spoken communication?
Bottom Line: In closed vocabulary tests, listeners could readily identify and transcribe speech synthesized from cortical activity.
Background: Technology that translates neural activity into speech would be transformative for people who are unable to communicate as a result of neurological impairments. Decoding speech from neural activity is challenging because speaking requires very precise and rapid multi-dimensional control of vocal tract articulators. Although existing assistive communication systems can enhance a patient’s quality of life, most users struggle to transmit more than 10 words per minute, far slower than the roughly 150 words per minute of natural speech. The study authors designed a neural decoder that explicitly leverages kinematic and sound representations encoded in human cortical activity to synthesize audible speech.
Study design: High-density electrocorticography (ECoG) signals were recorded from five participants undergoing intracranial monitoring for epilepsy treatment as they spoke several hundred sentences aloud. Listening tests of the synthesized speech were run on Amazon Mechanical Turk.
Setting: Department of Neurological Surgery, University of California, San Francisco.
Synopsis: Recurrent neural networks first decoded directly recorded cortical activity into representations of articulatory movement, and then transformed these representations into speech acoustics. In closed vocabulary tests, listeners could readily identify and transcribe speech synthesized from cortical activity. Intermediate articulatory dynamics enhanced performance even with limited data. Decoded articulatory representations were highly conserved across speakers, enabling a component of the decoder to be transferrable across participants. Furthermore, the decoder could synthesize speech when a participant silently mimed sentences.
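To make the two-stage structure concrete, the data flow described in the synopsis can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it substitutes a plain tanh recurrent layer with random, untrained weights for the study's recurrent networks, and the feature counts (N_ELECTRODES, N_ARTIC, N_ACOUSTIC, N_HIDDEN) and the function names are hypothetical choices made only to show how the sequences pass from one stage to the next.

```python
import numpy as np


def init_rnn(n_in, n_hidden, n_out, rng):
    """Random (untrained) weights for a tanh recurrent layer plus a linear readout."""
    scale = 0.1
    return {
        "W_in": rng.normal(0.0, scale, (n_hidden, n_in)),
        "W_rec": rng.normal(0.0, scale, (n_hidden, n_hidden)),
        "W_out": rng.normal(0.0, scale, (n_out, n_hidden)),
    }


def run_rnn(params, inputs):
    """Map a (time, n_in) sequence to a (time, n_out) sequence, one step at a time."""
    h = np.zeros(params["W_rec"].shape[0])
    outputs = []
    for x_t in inputs:
        # The hidden state carries context from earlier time steps.
        h = np.tanh(params["W_in"] @ x_t + params["W_rec"] @ h)
        outputs.append(params["W_out"] @ h)
    return np.array(outputs)


def decode(neural, stage1, stage2):
    """Stage 1: neural activity -> articulatory features.
    Stage 2: articulatory features -> acoustic features."""
    articulation = run_rnn(stage1, neural)
    acoustics = run_rnn(stage2, articulation)
    return articulation, acoustics


# Hypothetical sizes, chosen only for illustration.
N_ELECTRODES, N_ARTIC, N_ACOUSTIC, N_HIDDEN = 64, 12, 20, 32

rng = np.random.default_rng(0)
stage1 = init_rnn(N_ELECTRODES, N_HIDDEN, N_ARTIC, rng)
stage2 = init_rnn(N_ARTIC, N_HIDDEN, N_ACOUSTIC, rng)

# Stand-in for 100 time steps of recorded cortical activity (random numbers here).
neural = rng.normal(size=(100, N_ELECTRODES))
articulation, acoustics = decode(neural, stage1, stage2)
print(articulation.shape, acoustics.shape)  # (100, 12) (100, 20)
```

In this sketch the second stage takes only articulatory features as input, so it does not depend on any one participant's electrode layout. That is one way to picture how a component of such a decoder could be shared across participants.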
Citation: Anumanchipalli GK, Chartier J, Chang EF. Speech synthesis from neural decoding of spoken sentences. Nature. 2019;568:493-498.