Nat Neurosci. 2019 Sep;22(9):1469-1476.
doi: 10.1038/s41593-019-0458-4. Epub 2019 Aug 12.

Emergent tuning for learned vocalizations in auditory cortex

Jordan M Moore et al. Nat Neurosci. 2019 Sep.

Abstract

Vocal learners use early social experience to develop auditory skills specialized for communication. However, it is unknown where in the auditory pathway neural responses become selective for vocalizations or how the underlying encoding mechanisms change with experience. We used a vocal tutoring manipulation in two species of songbird to reveal that tuning for conspecific song arises within the primary auditory cortical circuit. Neurons in the deep region of primary auditory cortex responded more to conspecific songs than to other species' songs and more to species-typical spectrotemporal modulations, but neurons in the intermediate (thalamorecipient) region did not. Moreover, birds that learned song from another species exhibited parallel shifts in selectivity and tuning toward the tutor species' songs in the deep but not the intermediate region. Our results locate a region in the auditory processing hierarchy where an experience-dependent coding mechanism aligns auditory responses with the output of a learned vocal motor behavior.

Conflict of interest statement

COMPETING INTERESTS

The authors declare no competing interests.

Figures

Fig. 1.
Juvenile songbirds learn song from conspecific or heterospecific tutors. a, Spectrograms of song segments from an adult zebra finch and adult long-tailed finch tutored by conspecifics (zfZF and lfLF, respectively; tutor songs in Fig. 2), a Bengalese finch tutor (BF), and its adult Bengalese finch (bfBF), zebra finch (zfBF, brown) and long-tailed finch (lfBF, light blue) pupils. Symbols denote BF syllable types and corresponding pupil copies, with high-magnification spectrograms of examples shown to the right. b, Both normal and cross-tutored birds learned most of their syllable repertoire from their tutor [n = 25 (zfZF), 11 (zfBF), 3 (bfBF), 10 (lfBF), and 12 (lfLF) birds; Tukey-Kramer post hoc tests, all P > 0.06]. c, Pupils in all groups reproduced their tutor’s syllables accurately (filled box-and-whisker plots), though zfBF birds produced syllables that were less similar to their tutors’ than did zfZF, bfBF, or lfBF pupils [n = 192 (zfZF), 113 (zfBF), 32 (bfBF), 97 (lfBF), and 74 (lfLF) syllable types; ANOVAs used bird identity as a nested covariate, *P < 0.05, **P < 0.01, ***P < 0.001]. For reference, open boxplots along the top show similarity between different renditions of the same syllable type within pupils [n = 192 (zfZF), 113 (zfBF), 32 (bfBF), 97 (lfBF), and 74 (lfLF) syllable types], and those along the bottom show similarity between different syllable types of pupils and tutors [n = 1287 (zfZF), 1733 (zfBF), 488 (bfBF), 1210 (lfBF), and 424 (lfLF) comparisons]. For b and c, the measure of center is the median, box limits show the 25th and 75th percentiles, whiskers extend up to 1.5× the interquartile range beyond the quartiles; and circles show outliers.
Fig. 2.
Selectivity for conspecific song emerges in primary auditory cortex. a, Schematic of the songbird auditory system in which shades of green indicate cortical region (intermediate, superficial, deep, secondary) and lines show major projections between them. b, Spike rasters show song-evoked responses of a single neuron from a zfZF (orange, deep region) and a lfLF (gray, secondary region) bird to ZF (top) and LF (bottom) songs. Lines above spectrograms show log-transformed amplitude envelopes used to delineate syllable boundaries (indicated by boxes below spectrograms), and rows in each raster show the spike times during an individual trial. c, Spike rates of the same two neurons shown in b to ZF versus LF syllables with responses to different songs organized in columns (arrows indicate the songs shown in b). Circles show the mean spike rates to each syllable (n = 13, 20, 17, 22, 22 syllables in ZF songs; n = 21, 33, 22, 13, 14 syllables in LF songs), solid black lines show the mean spike rates across all syllables per species, and dotted lines show spontaneous spike rates. Selectivity was computed as the difference in mean spike rate to the syllables of two species divided by their unpooled variance (t-statistic). d, Distributions of spike rate selectivity for ZF versus LF songs in zfZF (orange) and lfLF (gray) neurons in each AC region. All regions in zfZF birds had, on average, higher spike rates to ZF song (n = 148, 179, 281, 217 neurons per intermediate, superficial, deep, and secondary regions, respectively). In lfLF birds, intermediate-region neurons also had greater responses to ZF song, but the deep and secondary regions had greater responses to LF song (n = 168, 53, 237, 199 neurons per region). Thus, only the deep and secondary regions were selective for conspecific song in both species. 
Colored stars indicate a significant difference between song types within a group (repeated-measures ANOVAs with bird identity as a covariate) and are plotted on the side of the song that evoked a greater response. Black bars show the separation between distribution means, and black stars indicate a difference in selectivity between bird groups (ANOVAs with bird identity as a nested covariate). Dashed lines indicate the criteria for selectivity in single neurons (t = ±1.96). *P < 0.05, **P < 0.01, ***P < 0.001.
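The selectivity measure in the Fig. 2 caption can be illustrated with a minimal sketch. It assumes the statistic takes the Welch form (difference in mean spike rate divided by the standard error built from the two unpooled sample variances), which is one reading of the caption and not confirmed code from the authors. The function names `selectivity_t` and `classify`, and the sample spike rates, are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def selectivity_t(rates_a, rates_b):
    """Welch-style t-statistic for per-syllable spike rates of one neuron.

    rates_a, rates_b: mean spike rates evoked by each syllable of
    species A and species B (assumed input format, not from the paper).
    """
    # Standard error from the two unpooled sample variances.
    se = sqrt(variance(rates_a) / len(rates_a) + variance(rates_b) / len(rates_b))
    return (mean(rates_a) - mean(rates_b)) / se


def classify(t, criterion=1.96):
    """Apply the single-neuron criterion from the caption (t = +/-1.96)."""
    if t >= criterion:
        return "selective for A"
    if t <= -criterion:
        return "selective for B"
    return "not selective"


# Hypothetical spike rates (spikes/s) to four syllables of each species.
t_value = selectivity_t([10, 12, 14, 16], [4, 6, 8, 10])
label = classify(t_value)
```

A positive value means stronger responses to the first species' syllables; values between the two criteria are treated as not selective.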
Fig. 3.
Song selectivity and population response dynamics are experience-dependent. a, Distributions of selectivity for ZF versus BF songs in zfZF birds (orange; n = 149, 181, 270, 197 neurons per AC region) and zfBF birds (brown; n = 73, 80, 136, 250 neurons per region). In zfZF birds, all regions had higher spike rates to ZF songs. In zfBF birds, the superficial and secondary regions exhibited no difference between songs, and selectivity in the deep region was shifted toward BF songs compared to normal birds. Colored stars indicate a significant response difference between song types within neurons (repeated-measures ANOVAs with bird identity as a covariate), black bars show the separation between distribution means, and black stars indicate a difference in selectivity between groups (ANOVAs with bird identity as a nested covariate). b, Spectrograms (0–8 kHz) of ZF (top) and BF (bottom) song segments plotted above deep-region pPSTHs (mean ± 95% C.I.) and neurograms (z-scored single-neuron PSTHs) from two birds in each group (n = 40 randomly selected neurons per bird). Colored lines above pPSTHs indicate sustained differences (≥10 ms) between groups, and bar graphs to the right show the number of segments in each ZF or BF stimulus that evoked a greater pPSTH in zfZF (orange) or zfBF (brown) birds (two-sided paired t-tests, n = 5 songs for each species). Traces to the right of neurograms show the selectivity of each respective neuron (dashed lines are t = ±1.96). c, Same as in a, but showing distributions of spike rate selectivity for LF versus BF songs in lfLF birds (gray, n = 164, 48, 224, 208 neurons per AC region) and lfBF birds (light blue, n = 133, 23, 75, 190 neurons per region). Both groups had greater responses to BF song in the intermediate region, but only lfLF birds had greater responses to LF songs in the deep and secondary regions. 
d, Same as in b, but showing spectrograms of LF and BF songs and deep-region pPSTHs and randomly selected neurograms from lfLF birds (n = 40 neurons per bird) and lfBF birds (n = 35 and 23 neurons per bird). For b and d, pPSTHs and neurograms were shifted in time by the average response latency of paired groups (zfZF and zfBF, 15 ms; lfLF and lfBF, 22 ms). *P < 0.05, **P < 0.01, ***P < 0.001.
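The "sustained differences (≥10 ms)" criterion in the Fig. 3 caption can be sketched as a run-length filter. The caption does not say how a per-bin difference between two pPSTHs was decided, so this example takes a precomputed boolean vector as input and assumes 1 ms bins; both choices, and the name `sustained_runs`, are hypothetical.

```python
def sustained_runs(differs, bin_ms=1.0, min_ms=10.0):
    """Return (start, stop) bin indices of runs lasting at least min_ms.

    differs: per-bin booleans marking where two pPSTHs differ
    (how that is decided is an assumption left to the caller).
    stop is exclusive, as in Python slicing.
    """
    runs = []
    start = None
    # A trailing False closes any run that reaches the final bin.
    for i, flag in enumerate(list(differs) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if (i - start) * bin_ms >= min_ms:
                runs.append((start, i))
            start = None
    return runs


# Hypothetical mask: a 12-bin run is kept, a 5-bin run is discarded.
mask = [False] * 3 + [True] * 12 + [False] * 2 + [True] * 5
kept = sustained_runs(mask)
```

Brief differences shorter than the minimum duration are ignored, so only extended stretches of a syllable are marked.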
Fig. 4.
Tuning for the spectrotemporal modulations in learned song emerges in parallel with song selectivity. a, Spectrograms (0–8 kHz) of ZF, LF, and BF syllables are shown above spectrograms of their best-fit ripples. b, Spectrograms of some ripples used as stimuli, organized by spectral modulation (harmonic) density and temporal modulation rate. c, Song modulation heat maps show the log-transformed proportions of ZF, LF, and BF songs (n = 5 each) composed of each spectrotemporal modulation frequency. Symbols indicate the modulation frequencies of ripples shown in a; contour lines delineate the primary modulations constituting 90% of each species’ songs. d, Neural response heat maps show the mean normalized spike rates to ripple stimuli from the intermediate (upper) and deep (lower) regions of each bird group (zfZF, n = 167, 296 neurons from intermediate and deep regions, respectively; zfBF, n = 89, 186 neurons; lfLF, n = 192, 255 neurons; lfBF, n = 162, 111 neurons). Pearson correlation coefficients show the relationships between mean response maps of each bird group (for all shown, P ≤ 0.001), and they were larger between birds that shared a tutor species (zfBF and lfBF) than between birds of the same species that had different tutor species (zfZF and zfBF; lfLF and lfBF). The tutor species’ song contour lines from c are overlaid on the tuning response maps. e, Box-and-whisker plots of within-neuron differences in spike rates evoked by ripples inside versus outside the song contour lines. Sample sizes are the same as in d; the measure of center is the median, box limits show the 25th and 75th percentiles, whiskers extend to minimum/maximum values. Repeated-measures ANOVAs with bird identity as a covariate, **P < 0.01, ***P < 0.001.
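Two quantities in the Fig. 4 caption lend themselves to a short illustration: the Pearson correlation between two mean response maps (panel d) and the within-neuron difference in spike rate for ripples inside versus outside a song contour (panel e). The sketch below assumes each map is a small grid of normalized spike rates indexed by spectral and temporal modulation, and that the contour is supplied as a boolean mask of the same shape. The function names and toy values are hypothetical and not taken from the paper.

```python
from math import sqrt
from statistics import mean


def pearson_r(map_a, map_b):
    """Pearson correlation between two tuning maps of equal shape."""
    a = [v for row in map_a for v in row]  # flatten row by row
    b = [v for row in map_b for v in row]
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def inside_minus_outside(rates, inside_mask):
    """Mean rate to ripples inside the contour minus mean rate outside it."""
    inside, outside = [], []
    for rate_row, mask_row in zip(rates, inside_mask):
        for rate, is_inside in zip(rate_row, mask_row):
            (inside if is_inside else outside).append(rate)
    return mean(inside) - mean(outside)


# Toy 2 x 2 maps (hypothetical normalized spike rates).
map_one = [[1.0, 2.0], [3.0, 4.0]]
map_two = [[2.0, 4.0], [6.0, 8.0]]
mask = [[True, False], [True, False]]

r = pearson_r(map_one, map_two)
diff = inside_minus_outside(map_one, mask)
```

A positive difference from `inside_minus_outside` would indicate stronger responses to modulations typical of the reference song; computing it per neuron yields the kind of distribution summarized in panel e.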
Fig. 5.
Neural population response dynamics to song reflect tuning for spectrotemporal modulations. a, Spectrograms (0–8 kHz) of different species’ syllables are plotted above their corresponding spectral modulation density (red) and temporal modulation rate (yellow) vectors and above deep-region pPSTHs (mean ± 95% C.I.). Syllable segments that evoked a greater response in zfZF birds (orange, n = 281 neurons) or lfLF birds (gray, n = 237 neurons) are indicated by horizontal lines. b, Same as a, but showing ZF and BF syllables and zfZF (n = 270 neurons) and zfBF (brown, n = 136 neurons) pPSTHs. c, Same as a, but showing LF and BF syllables and lfLF (n = 224 neurons) and lfBF (light blue, n = 75 neurons) pPSTHs. d, Left, Box-and-whisker plots show the spectral modulation densities of syllable segments (from ZF and LF songs combined) that evoked sustained differences (≥10 ms) between zfZF (orange) and lfLF (gray) pPSTHs. Top row shows data from intermediate-region pPSTHs [n = 183 (zfZF) and 127 (lfLF) segments], and bottom row shows data from deep-region pPSTHs [n = 188 (zfZF) and 101 (lfLF) segments]. Right, Spectral modulation tuning curves (mean ± 95% C.I.) of zfZF and lfLF birds diverge at the same spectral modulation frequencies as those in syllable segments that drive distinct pPSTH responses [int., n = 141 (zfZF) and 151 (lfLF) neurons; deep, n = 242 (zfZF) and 178 (lfLF) neurons]. e, Same as d but to ZF and BF songs and zfZF and zfBF birds from the intermediate [n = 90 (zfZF) and 47 (zfBF) syllable segments; n = 142 (zfZF) and 69 (zfBF) neurons] and deep regions [n = 65 (zfZF) and 67 (zfBF) segments; n = 234 (zfZF) and 123 (zfBF) neurons]. f, Same as d but to LF and BF songs and lfLF and lfBF birds from the intermediate [n = 107 (lfLF) and 73 (lfBF) syllable segments; n = 149 (lfLF) and 114 (lfBF) neurons] and deep regions [n = 103 (lfLF) and 50 (lfBF) segments; n = 175 (lfLF) and 54 (lfBF) neurons]. 
For the boxplots in d-f, the measure of center is the median, box limits show the 25th and 75th percentiles, and whiskers extend to minimum/maximum values. Tests between syllable segments were ANOVAs with stimulus species as a covariate; tests between tuning curves were ANOVAs with bird identity as a nested covariate, *P < 0.05, **P < 0.01, ***P < 0.001.
Fig. 6.
Neurons that respond selectively to same species’ songs have highly similar modulation tuning regardless of species identity or tutoring experience. a, Pie charts show the proportions of neurons from all AC regions (with raw numbers superimposed) in zfZF (top) and lfLF (bottom) birds that were selective for ZF song (orange), selective for LF song (gray), or not selective (open). Heat maps show the average modulation tuning maps of neurons selective for ZF (left) or LF (right) songs. b, Same as a but separating zfZF and zfBF neurons based on selectivity for ZF or BF songs. c, Same as a but separating lfLF and lfBF neurons based on selectivity for LF or BF songs. For all comparisons, mean tuning maps of neurons with the same song selectivity were positively correlated (between groups: 0.35 ≤ r ≤ 0.92, all P < 0.001), and maps of neurons from the same species but with different selectivity were not (within groups: −0.70 ≤ r ≤ −0.02). Modulation tuning maps for individual AC regions are shown in Fig. S15.
