A Task-Optimized Neural Network Replicates Human Auditory Behavior, Predicts Brain Responses, and Reveals a Cortical Processing Hierarchy

Neuron. 2018 May 2;98(3):630-644.e16. doi: 10.1016/j.neuron.2018.03.044. Epub 2018 Apr 19.


A core goal of auditory neuroscience is to build quantitative models that predict cortical responses to natural sounds. Reasoning that a complete model of auditory cortex must solve ecologically relevant tasks, we optimized hierarchical neural networks for speech and music recognition. The best-performing network contained separate music and speech pathways following early shared processing, potentially replicating human cortical organization. The network performed both tasks as well as humans and exhibited human-like errors despite not being optimized to do so, suggesting common constraints on network and human performance. The network predicted fMRI voxel responses substantially better than traditional spectrotemporal filter models throughout auditory cortex. It also provided a quantitative signature of cortical representational hierarchy: primary and non-primary responses were best predicted by intermediate and late network layers, respectively. The results suggest that task optimization provides a powerful set of tools for modeling sensory systems.
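The layer-wise comparison described above can be illustrated with a generic voxel-wise encoding model. The sketch below is not the authors' pipeline. It assumes ridge regression from each layer's features to voxel responses, a single train/test split, and held-out Pearson correlation as the score. All function names, array sizes, and the regularization value are hypothetical, and the data are synthetic.

```python
import numpy as np


def ridge_fit(X, Y, alpha):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


def heldout_correlation(X_train, Y_train, X_test, Y_test, alpha=1.0):
    """Fit on the training sounds, return Pearson r per voxel on held-out sounds."""
    # Center with training statistics only, so the test set stays unseen.
    x_mean = X_train.mean(axis=0)
    y_mean = Y_train.mean(axis=0)
    W = ridge_fit(X_train - x_mean, Y_train - y_mean, alpha)
    pred = (X_test - x_mean) @ W + y_mean

    pred_c = pred - pred.mean(axis=0)
    true_c = Y_test - Y_test.mean(axis=0)
    num = (pred_c * true_c).sum(axis=0)
    den = np.sqrt((pred_c ** 2).sum(axis=0) * (true_c ** 2).sum(axis=0))
    return num / den


def best_layer_per_voxel(layer_features, Y, n_train, alpha=1.0):
    """Score every layer on every voxel and return the best layer index per voxel."""
    scores = np.stack([
        heldout_correlation(F[:n_train], Y[:n_train], F[n_train:], Y[n_train:], alpha)
        for F in layer_features
    ])  # shape: (n_layers, n_voxels)
    return scores.argmax(axis=0), scores


# Synthetic demo (assumption: two "layers" and ten "voxels", none of it real data).
rng = np.random.default_rng(0)
n_sounds, n_feat, n_vox = 200, 20, 10
layers = [rng.standard_normal((n_sounds, n_feat)) for _ in range(2)]

# Voxels 0-4 are driven by layer 0, voxels 5-9 by layer 1, plus small noise.
Y = np.empty((n_sounds, n_vox))
Y[:, :5] = layers[0] @ rng.standard_normal((n_feat, 5))
Y[:, 5:] = layers[1] @ rng.standard_normal((n_feat, 5))
Y += 0.1 * rng.standard_normal(Y.shape)

best, scores = best_layer_per_voxel(layers, Y, n_train=150)
print(best)
```

With real data, `layer_features` would hold one activation matrix per network layer (sounds by units) and `Y` the measured voxel responses to the same sounds. A map of `best` across cortex is one way to read out the kind of hierarchy the abstract reports, though a careful analysis would also cross-validate the regularization strength.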

Keywords: auditory cortex; convolutional neural network; deep learning; deep neural network; encoding models; fMRI; hierarchy; human auditory cortex; natural sounds; word recognition.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Acoustic Stimulation / methods*
  • Adolescent
  • Adult
  • Aged
  • Auditory Cortex / diagnostic imaging*
  • Auditory Cortex / physiology*
  • Female
  • Forecasting
  • Humans
  • Magnetic Resonance Imaging / methods*
  • Male
  • Middle Aged
  • Nerve Net / diagnostic imaging*
  • Nerve Net / physiology*
  • Psychomotor Performance / physiology*
  • Young Adult