Joint Representation of Spatial and Phonetic Features in the Human Core Auditory Cortex

Prachi Patel et al.

Cell Rep. 2018 Aug 21;24(8):2051-2062.e2. doi: 10.1016/j.celrep.2018.07.076.

Abstract

The human auditory cortex simultaneously processes speech and determines the location of a speaker in space. Neuroimaging studies in humans have implicated core auditory areas in processing the spectrotemporal and the spatial content of sound; however, how these features are represented together is unclear. We recorded directly from human subjects implanted bilaterally with depth electrodes in core auditory areas as they listened to speech from different directions. We found local and joint selectivity to spatial and spectrotemporal speech features, where the spatial and spectrotemporal features are organized independently of each other. This representation enables successful decoding of both spatial and phonetic information. Furthermore, we found that the location of the speaker does not change the spectrotemporal tuning of the electrodes but, rather, modulates their mean response level. Our findings contribute to defining the functional organization of responses in the human auditory cortex, with implications for more accurate neurophysiological models of speech processing.

Keywords: auditory cortex; binaural sound; electrocorticography; sound localization; speech.

Figures

Figure 1. Direction Selectivity of Responses in the Human Auditory Cortex
(A) Speech-responsive electrodes from all subjects shown on an ICBM152 average brain on axial MRI (left) and on core auditory areas (right). Color saturation indicates the speech versus silence t value for each electrode. (B) Task schematic. Subjects are presented with speech uttered from five color-coded angles in the horizontal plane. (C) Average high-gamma responses and angle selectivity indices (ASIs) of five representative electrodes to −90°, −45°, 0°, 45°, and 90° angles. (D) Hierarchical clustering of ASIs for all electrodes (columns) and angles (rows). Electrode clusters are shown by the dendrogram at the top, whereas angle clusters are shown by the dendrogram on the left; electrode clusters in red and blue indicate electrode locations in the right and left brain hemispheres, respectively. (E) Histograms of ASI for a given sound angle; each electrode group is colored by its best angle (BA). (F) Average separability of direction (f statistic) in Heschl's gyrus, the superior temporal gyrus (STG), and the planum temporale (PT). The error bars indicate SE. ***p < 0.001. See also Figure S1.
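The caption does not spell out how the ASI is computed or how the clustering in (D) is performed. A minimal sketch, assuming the ASI of an electrode is its mean high-gamma response per angle normalized across the five angles, and using SciPy's agglomerative clustering for the dendrograms; the normalization and all variable names are illustrative assumptions, not the paper's exact method:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Placeholder data: mean high-gamma response of each electrode at the
# five angles (-90, -45, 0, 45, 90 degrees); real recordings would go here.
rng = np.random.default_rng(0)
responses = rng.random((201, 5))  # (n_electrodes, n_angles)

# Assumed ASI definition: per-angle response normalized so each
# electrode's values sum to 1 across angles (the paper's formula may differ).
asi = responses / responses.sum(axis=1, keepdims=True)

# Hierarchical clustering of electrodes (columns of Figure 1D) and of
# angles (rows), matching the two dendrograms in the figure.
electrode_linkage = linkage(asi, method="ward")
angle_linkage = linkage(asi.T, method="ward")
```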
Figure 2. Contralateral Tuning of Electrodes
(A) Average t values for left- versus right-sided angles for each electrode in the left and right brain hemispheres. (B) Histogram of the contralateral strength index (CSI) for left and right brain hemisphere electrodes. (C) CSI plotted on an ICBM152 average brain. Each electrode is colored according to its CSI value (red, positive CSI, indicating left angle preference; blue, negative CSI, indicating right angle preference). (D) Percentages of ipsilateral and contralateral electrodes found in Heschl's gyrus, the STG, and the PT. ***p < 0.001. See also Figure S3.
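The CSI formula is not given in the caption. One plausible reading, consistent with the left- versus right-sided t values in panel (A), is a t statistic contrasting responses to left-sided against right-sided angles, so that positive values mark a left-angle preference. This definition is an assumption:

```python
import numpy as np
from scipy.stats import ttest_ind

# Placeholder single-trial high-gamma responses for one electrode:
# left-sided angles (-90, -45 degrees) vs. right-sided angles (45, 90 degrees).
rng = np.random.default_rng(1)
left_trials = rng.normal(1.2, 0.3, size=100)
right_trials = rng.normal(0.8, 0.3, size=100)

# Assumed CSI: the left-vs.-right t value, so CSI > 0 indicates a
# left-angle preference (red in Figure 2C) and CSI < 0 a right-angle
# preference (blue).
csi, _ = ttest_ind(left_trials, right_trials)
```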
Figure 3. Independent Encoding of Spatial and Spectrotemporal Features
(A) Top: five example electrodes from one subject. Bottom: spectrotemporal receptive fields (STRFs) of these electrodes. Each column is one electrode, and each row is a different speech direction from which the STRF is calculated. The best frequency (BF) and response latency (RL) for each STRF are marked with a black dot. The ASI vector of each electrode is shown below. (B) Electrodes plotted on the core auditory cortex of ICBM152, color-coded by BF (right) and RL (left). (C) Histograms of correlation between STRFs from the same angle but different electrodes (blue) compared with STRFs from the same electrode but different angles (red). (D) RL and BF for right or left angles versus RL and BF for angle 0°. Electrodes are color-coded by angle (red, right side; blue, left side). (E) BF versus RL plot for all speech-responsive electrodes, colored by their BA tuning. ***p < 0.001. See also Figure S4.
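STRF estimation details live in the paper's methods, not this caption. Below is a minimal sketch of one standard estimator, ridge regression from a lagged auditory spectrogram to the high-gamma envelope, with BF and RL read off as the coordinates of the STRF's peak; the regularizer, lag window, and dimensions are all assumptions:

```python
import numpy as np
from sklearn.linear_model import Ridge

n_t, n_freq, n_lags = 2000, 32, 30  # time bins, frequency bands, lags

# Placeholder stimulus and response; real data would be the auditory
# spectrogram S(t, f) and the electrode's high-gamma envelope.
rng = np.random.default_rng(2)
spectrogram = rng.random((n_t, n_freq))
response = rng.random(n_t)

# Design matrix: each row is the spectrogram over the preceding
# n_lags time bins, flattened.
X = np.stack([spectrogram[i - n_lags:i].ravel() for i in range(n_lags, n_t)])

# Ridge regression is one common STRF estimator; the paper may use a
# different one (e.g., boosting or normalized reverse correlation).
strf = Ridge(alpha=1.0).fit(X, response[n_lags:]).coef_.reshape(n_lags, n_freq)

# BF and RL as the frequency band and lag of the STRF's largest weight
# (the black dots in Figure 3A).
rl_idx, bf_idx = np.unravel_index(np.argmax(strf), strf.shape)
```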
Figure 4. Joint Population Decoding of Spatial and Phonetic Features
(A) Confusion patterns for classifying angles (top) and manners of articulation (bottom) from all electrodes (n = 201). (B) Mean classification accuracy for varying numbers of electrodes for angle classification (top) and manner classification (bottom). Error bars denote standard deviation, and colors denote whether electrodes from single (red) or both (black) brain hemispheres were used for classification. (C) Average of z-scored electrode responses (far left), separated by brain hemisphere, and the weights assigned to electrodes by the spatial (left) and manner (right) classifiers. Angle classifier weights are sorted by the ASI of electrodes, and manner classifier weights are sorted by the BF of electrodes. (D) Percentage increase in classification accuracy from a single brain hemisphere to both hemispheres for manner and angle classification. (E) Scatterplot of the maximum weight given to each electrode by the manner and angle classifiers, colored by electrode location. ***p < 0.001. See also Figure S5.
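The caption reports confusion patterns and accuracies but not the decoder itself. A sketch assuming a regularized linear classifier over the vector of electrode responses, whose coefficients would play the role of the per-electrode weights in (C); the classifier family and cross-validation scheme are assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import confusion_matrix

# Placeholder population responses: one feature per electrode per trial.
rng = np.random.default_rng(3)
X = rng.random((500, 201))               # (n_trials, n_electrodes)
angle_labels = rng.integers(0, 5, 500)   # one of the five angles per trial

# Linear multi-class classifier; clf.coef_ would correspond to the
# per-electrode classifier weights plotted in Figure 4C.
clf = LogisticRegression(max_iter=1000)
pred = cross_val_predict(clf, X, angle_labels, cv=5)
conf = confusion_matrix(angle_labels, pred)  # Figure 4A-style confusion pattern
```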
Figure 5. Mechanism of Joint Encoding of Spatial and Spectrotemporal Features at Individual Electrodes
(A) Scatterplot of percentage change of the mean (left) and standard deviation (right) of neural responses relative to the baseline (angle 0°) for angle 90° (x axis) and angle −90° (y axis) for all electrodes. (B) Proposed computational model. The auditory spectrogram of speech, S(t,f), is convolved with the electrode's STRF and then modulated by a gain and a bias factor that depend on the direction of sound. (C) Mean reduced prediction error of the neural responses relative to baseline (non-spatial STRF) when modulating the gain, the bias, or both in the model. The error bars indicate SE. (D) Average bias values for five angles from all speech-responsive electrodes colored by right (red) and left (blue) brain hemispheres. (E) Mean response level (bias) values for five angles from each speech-responsive electrode arranged by ASI and colored by right (red) and left (blue) brain hemispheres. ***p < 0.001. See also Figure S6.
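Panel (B) states the model in words; written out, the predicted response at direction θ is r̂_θ(t) = g_θ · (STRF ⊛ S)(t) + b_θ, with a fixed STRF and a direction-dependent gain g_θ and bias b_θ. A minimal sketch of that forward model, with illustrative names:

```python
import numpy as np

def predict_response(spectrogram, strf, gain, bias):
    """Forward model of Figure 5B: convolve the auditory spectrogram
    S(t, f) with the electrode's (direction-independent) STRF, then
    apply a direction-dependent gain and bias:
    r_hat(t) = gain * (STRF * S)(t) + bias."""
    n_lags, _ = strf.shape
    pred = np.zeros(spectrogram.shape[0])
    for t in range(n_lags, spectrogram.shape[0]):
        # Sum over frequency and lag of the STRF-weighted stimulus history.
        pred[t] = np.sum(strf * spectrogram[t - n_lags:t])
    return gain * pred + bias

# Per panels (C)-(E), letting the bias (and, to a lesser extent, the gain)
# vary by direction captures most of the spatial modulation, while the
# STRF itself stays fixed across directions.
```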
