Article

Towards reconstructing intelligible speech from the human auditory cortex

Journal

SCIENTIFIC REPORTS
Volume 9

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41598-018-37359-z

Funding

  1. National Institutes of Health [NIDCD-DC014279]
  2. National Institute of Mental Health [R21MH114166]
  3. Pew Charitable Trusts, Pew Biomedical Scholars Program

Auditory stimulus reconstruction is a technique that finds the best approximation of the acoustic stimulus from the population of evoked neural activity. Reconstructing speech from the human auditory cortex creates the possibility of a speech neuroprosthetic that establishes direct communication with the brain, and it has been shown to be possible in both overt and covert conditions. However, the low quality of the reconstructed speech has severely limited the utility of this method for brain-computer interface (BCI) applications. To advance the state of the art in speech neuroprosthesis, we combined recent advances in deep learning with the latest innovations in speech synthesis technologies to reconstruct closed-set intelligible speech from the human auditory cortex. We investigated how reconstruction accuracy depends on the regression method, linear or nonlinear (deep neural network), and on the acoustic representation used as the target of reconstruction, either the auditory spectrogram or speech synthesis parameters. In addition, we compared the reconstruction accuracy from low and high neural frequency ranges. Our results show that a deep neural network model that directly estimates the parameters of a speech synthesizer from all neural frequencies achieves the highest subjective and objective scores on a digit recognition task, improving intelligibility by 65% over the baseline method, which used linear regression to reconstruct the auditory spectrogram. These results demonstrate the efficacy of deep learning and speech synthesis algorithms for designing the next generation of speech BCI systems, which can not only restore communication for paralyzed patients but also have the potential to transform human-computer interaction technologies.
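
As a rough illustration of the baseline mentioned in the abstract (linear regression from neural activity to an auditory spectrogram), the following Python sketch fits a linear decoder on synthetic data. It is not the authors' code: the function names, the use of ridge regularization, the number of time lags, and the simulated "neural" signals are all assumptions made for illustration, and NumPy is assumed to be installed.

```python
import numpy as np


def lagged_features(neural, n_lags):
    """Stack each time step with the n_lags - 1 samples that follow it.

    The neural response trails the stimulus, so the decoder for time t
    looks at neural activity from t to t + n_lags - 1 (zero-padded at the end).
    """
    n_times, n_channels = neural.shape
    padded = np.vstack([neural, np.zeros((n_lags - 1, n_channels))])
    return np.hstack([padded[lag:lag + n_times] for lag in range(n_lags)])


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ Y)


def reconstruct(neural, weights, n_lags):
    """Predict spectrogram frames from neural activity with a fitted decoder."""
    return lagged_features(neural, n_lags) @ weights


def mean_correlation(Y_true, Y_pred):
    """Average Pearson correlation across spectrogram frequency bands."""
    corrs = [np.corrcoef(Y_true[:, k], Y_pred[:, k])[0, 1]
             for k in range(Y_true.shape[1])]
    return float(np.mean(corrs))


if __name__ == "__main__":
    # Synthetic stand-in data (assumption): a random "spectrogram" and
    # "electrodes" that respond to it linearly after a 2-sample delay.
    rng = np.random.default_rng(0)
    n_times, n_bands, n_channels, n_lags = 2000, 8, 32, 5
    spectrogram = rng.standard_normal((n_times, n_bands))
    mixing = rng.standard_normal((n_bands, n_channels))
    neural = np.roll(spectrogram, 2, axis=0) @ mixing
    neural += 0.1 * rng.standard_normal(neural.shape)

    # Fit on the first 1500 samples, evaluate on the held-out remainder.
    split = 1500
    X_train = lagged_features(neural[:split], n_lags)
    weights = fit_ridge(X_train, spectrogram[:split], alpha=1.0)
    predicted = reconstruct(neural[split:], weights, n_lags)
    print("held-out correlation:", mean_correlation(spectrogram[split:], predicted))
```

In this sketch the reconstruction target is the spectrogram itself. The approach the abstract reports as best replaces the linear map with a deep neural network and the spectrogram target with speech synthesizer parameters, neither of which is modeled here.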
