Science
Efficient neural encoding of time intervals in speech and complex sound sequences
Key Points
Encoding time intervals in complex sounds faces a dual challenge: the code must be precise yet cover a broad dynamic range. In speech, for example, a 10 ms lengthening of a syllable can signal stress or a phrasal boundary, yet the distribution of syllable durations has a long tail extending beyond 500 ms, and its statistics vary across speakers. Here, we propose that the auditory cortex employs efficient coding to represent time intervals. When listeners heard syllable sequences drawn from different duration distributions, the magnetoencephalographic (MEG) response from the temporal cortex scaled with syllable duration, a relationship we characterize with the interval response function. Crucially, this interval response function met the predictions of efficient coding: its intercept and slope adapted to the mean and variance of syllable duration, respectively, and it consistently exhibited a compressive nonlinearity that reduced response skewness, consistent with a maximum entropy code. A computational model that continuously updates its inference of the duration distribution provided an algorithmic account of this efficient coding, and intracranial electroencephalography (iEEG) data confirmed the same principles during natural speech comprehension. Together, our findings reveal an efficient neural mechanism that supports precise encoding of highly variable time intervals in complex sound sequences.
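The efficient-coding predictions summarized above can be illustrated with a small simulation. The sketch below is not the authors' model. It is a toy under stated assumptions: syllable durations are taken to be lognormally distributed, the compressive nonlinearity is taken to be a logarithm, and the running statistics are tracked with Welford's online algorithm on log-duration. The names `AdaptiveIntervalCoder`, `simulate`, and `skewness` are hypothetical and exist only for this illustration.

```python
import math
import random


def skewness(xs):
    """Sample skewness (third standardized moment) of a list of numbers."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return sum(((x - mean) / sd) ** 3 for x in xs) / n


class AdaptiveIntervalCoder:
    """Toy interval response function (an assumption, not the paper's model).

    The response is a compressive (log) transform of duration whose
    intercept and slope follow running estimates of the input statistics.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0  # running mean of log-duration
        self.m2 = 0.0    # running sum of squared deviations (Welford)

    def update(self, duration_ms):
        """Update the running estimate of the duration distribution."""
        x = math.log(duration_ms)
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def respond(self, duration_ms):
        """Return the normalized, compressed response to one interval."""
        if self.n > 1 and self.m2 > 0:
            sd = math.sqrt(self.m2 / self.n)
        else:
            sd = 1.0  # no variance estimate yet; fall back to unit slope
        return (math.log(duration_ms) - self.mean) / sd


def simulate(median_ms, sigma, n=5000, seed=0):
    """Draw lognormal durations (assumed) and code them one at a time."""
    rng = random.Random(seed)
    coder = AdaptiveIntervalCoder()
    durations, responses = [], []
    for _ in range(n):
        d = rng.lognormvariate(math.log(median_ms), sigma)
        coder.update(d)
        durations.append(d)
        responses.append(coder.respond(d))
    return durations, responses
```

Calling `simulate(200.0, 0.5)` and comparing `skewness(durations)` with `skewness(responses)` should show the long-tailed input mapped to a far less skewed response. Running it with different `median_ms` and `sigma` values should give responses with similar mean and spread once the running estimates settle, which is the toy analogue of the intercept and slope adapting to the mean and variance of the input.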