On the mechanisms involved in the recovery of envelope information from temporal fine structure.
ABSTRACT: Three experiments were designed to provide psychophysical evidence for the existence of envelope information in the temporal fine structure (TFS) of stimuli that were originally amplitude modulated (AM). The original stimuli typically consisted of the sum of a sinusoidally AM tone and two unmodulated tones, so that the envelope and TFS could be determined a priori. Experiment 1 showed not only that normal-hearing listeners perceive AM when presented with the Hilbert fine structure alone, but also that AM detection thresholds are lower than those observed when presenting the original stimuli. Our analysis indicated that envelope recovery resulted from the failure of the decomposition process to remove the spectral components related to the original envelope from the TFS, and from the introduction of additional spectral components related to the original envelope, suggesting that frequency-to-amplitude-modulation conversion is not necessary to recover envelope information from TFS. Experiment 2 suggested that these spectral components interact in such a way that envelope fluctuations are minimized in the broadband TFS. Experiment 3 demonstrated that the modulation depth at the original carrier frequency is only slightly reduced compared to the depth of the original modulator. It also indicated that envelope recovery is not specific to the Hilbert decomposition.
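The Hilbert decomposition referred to above can be sketched numerically. The following is a minimal illustration, assuming NumPy; the FFT-based analytic signal is the standard textbook construction (equivalent to `scipy.signal.hilbert`), not code from the study, and the stimulus parameters are arbitrary examples rather than those used in the experiments.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal: zero negative frequencies, double positive ones."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)

def hilbert_decompose(x):
    """Split a signal into its Hilbert envelope and temporal fine structure."""
    z = analytic_signal(x)
    envelope = np.abs(z)        # slowly varying amplitude
    tfs = np.cos(np.angle(z))   # unit-amplitude carrier (the TFS)
    return envelope, tfs

# Example: sinusoidally AM tone (1-kHz carrier, 10-Hz modulator, depth 0.5)
fs = 16000
t = np.arange(fs) / fs
modulator = 1.0 + 0.5 * np.sin(2 * np.pi * 10 * t)
x = modulator * np.sin(2 * np.pi * 1000 * t)

env, tfs = hilbert_decompose(x)
# env recovers the modulator, and env * tfs reconstructs x exactly
```

Note that for a simple AM tone the envelope comes back essentially perfectly; the abstract's point is that even the TFS part alone (`tfs` here, after band limiting in the auditory system) can still carry envelope-related spectral components.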
Project description: The majority of psychoacoustic research investigating sound localization has utilized stationary sources, yet most naturally occurring sounds are in motion, either because the sound source itself moves or because the listener does. In normal-hearing (NH) listeners, previous research has shown the extent to which sound duration and velocity affect the ability to detect sound movement. By contrast, little is known about how listeners with hearing impairments perceive moving sounds; the only study to date comparing the performance of NH and bilateral cochlear implant (BiCI) listeners demonstrated significantly poorer performance on motion detection tasks in BiCI listeners. Cochlear implants, auditory prostheses offered to profoundly deaf individuals for access to spoken language, retain the signal envelope (ENV) while discarding the temporal fine structure (TFS) of the original acoustic input. As a result, BiCI users do not have access to low-frequency TFS cues, which have previously been shown to be crucial for sound localization in NH listeners. Instead, BiCI listeners seem to rely on ENV cues for sound localization, especially level cues. Given that NH and BiCI listeners differentially utilize ENV and TFS information, the present study aimed to investigate the usefulness of these cues for auditory motion perception. We created acoustic chimaera stimuli, which allowed us to test the relative contributions of ENV and TFS to auditory motion perception. Stimuli, either moving or stationary, were presented to NH listeners in the free field, and the task was to track the perceived sound location. We found that removing low-frequency TFS reduces sensitivity to sound motion, and that fluctuating speech envelopes strongly biased listeners to judge sounds as stationary. Our findings offer a possible explanation for why BiCI users struggle to identify sound motion, and provide a first account of the cues important to the functional aspect of auditory motion perception.
Project description: The temporal envelope and fine structure of speech make distinct contributions to the perception of speech in normal-hearing listeners, and are differentially affected by room reverberation. Previous work has demonstrated enhanced speech intelligibility in reverberant rooms when prior exposure to the room was provided. Here, the relative contributions of envelope and fine structure cues to this intelligibility enhancement were tested using an open-set speech corpus and virtual auditory space techniques to independently manipulate the speech cues within a simulated room. Intelligibility enhancement was observed only when the envelope was reverberant, indicating that the enhancement is envelope-based.
Project description: The present study investigated the role and relative contribution of envelope and temporal fine structure (TFS) to sentence recognition in noise. Target and masker stimuli were added at five different signal-to-noise ratios (SNRs) and filtered into 30 contiguous frequency bands. The envelope and TFS were extracted from each band by Hilbert decomposition. The final stimuli consisted of the envelope of the target/masker sound mixture at x dB SNR and the TFS of the same sound mixture at y dB SNR. A first experiment showed a very limited contribution of TFS cues, indicating that sentence recognition in noise relies almost exclusively on temporal envelope cues. A second experiment showed that replacing the carrier of a sound mixture with noise (vocoder processing) cannot be considered equivalent to disrupting the TFS of the target signal by adding a background noise. Accordingly, a re-evaluation of the vocoder approach as a model to further understand the role of TFS cues in noisy situations may be necessary. Overall, these data are consistent with the view that speech information is primarily extracted from the envelope while TFS cues are primarily used to detect glimpses of the target.
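The band-by-band recombination of one mixture's envelope with another's TFS (the "auditory chimaera" idea) can be sketched as follows. This is a simplified illustration assuming NumPy: brick-wall FFT bands stand in for the study's 30-band filterbank, and the function names (`bandsplit`, `chimaera`) are illustrative, not the authors' implementation.

```python
import numpy as np

def analytic(x):
    # FFT-based analytic signal (negative frequencies zeroed, positive doubled)
    n = len(x)
    g = np.zeros(n)
    g[0] = 1.0
    if n % 2 == 0:
        g[n // 2] = 1.0
        g[1:n // 2] = 2.0
    else:
        g[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * g)

def bandsplit(x, n_bands, fs):
    """Split x into contiguous brick-wall bands that sum back to x exactly."""
    n = len(x)
    freqs = np.abs(np.fft.fftfreq(n, 1 / fs))
    edges = np.linspace(0, fs / 2, n_bands + 1)
    edges[-1] = fs / 2 + 1          # include the Nyquist bin in the last band
    spectrum = np.fft.fft(x)
    return [np.fft.ifft(spectrum * ((freqs >= lo) & (freqs < hi))).real
            for lo, hi in zip(edges[:-1], edges[1:])]

def chimaera(env_source, tfs_source, n_bands, fs):
    """Impose the per-band envelope of one signal on the TFS of another."""
    out = np.zeros(len(env_source))
    for a, b in zip(bandsplit(env_source, n_bands, fs),
                    bandsplit(tfs_source, n_bands, fs)):
        za, zb = analytic(a), analytic(b)
        out += np.abs(za) * np.cos(np.angle(zb))   # ENV of a x TFS of b
    return out

# Example: put an AM tone's envelope onto an unmodulated carrier
fs = 8000
t = np.arange(fs) / fs
am_tone = (1 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
flat_tone = np.sin(2 * np.pi * 1000 * t)
y = chimaera(am_tone, flat_tone, 30, fs)
```

A useful sanity check on this construction: because the bands are complementary and envelope times TFS reconstructs each band signal, `chimaera(x, x, ...)` returns `x` itself.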
Project description:Since 2005, thousands of genome-wide association studies (GWAS) have been published, identifying hundreds of thousands of genetic variants that increase risk of complex traits such as autoimmune diseases. This wealth of data has the potential to improve patient care, through personalized medicine and the identification of novel drug targets. However, the potential of GWAS for clinical translation has not been fully achieved yet, due to the fact that the functional interpretation of risk variants and the identification of causal variants and genes are challenging. The past decade has seen the development of great advances that are facilitating the overcoming of these limitations, by utilizing a plethora of genomics and epigenomics tools to map and characterize regulatory elements and chromatin interactions, which can be used to fine map GWAS loci, and advance our understanding of the biological mechanisms that cause disease.
Project description: There is much debate on how the spectrotemporal modulations of speech (or its spectrogram) are encoded in the responses of the auditory nerve, and whether speech intelligibility is best conveyed via the "envelope" (E) or "temporal fine-structure" (TFS) of the neural responses. Studies making wide use of vocoders to resolve this question have commonly assumed that manipulating the amplitude-modulation and frequency-modulation components of the vocoded signal alters the relative importance of E or TFS encoding on the nerve, thus facilitating assessment of their relative importance to intelligibility. Here we argue that this assumption is incorrect, and that the vocoder approach is ineffective in differentially altering the neural E and TFS. In fact, we demonstrate, using a simplified model of early auditory processing, that both neural E and TFS encode the speech spectrogram with constant and comparable relative effectiveness regardless of the vocoder manipulations. However, we also show that neural TFS cues are less vulnerable than their E counterparts under severely noisy conditions, and hence should play a more prominent role in cochlear stimulation strategies.
Project description: The speech signal may be divided into spectral frequency bands, each band containing temporal properties of the envelope and fine structure. This study measured the perceptual weights for the envelope and fine structure in each of three frequency bands for sentence materials in young normal-hearing listeners, older normal-hearing listeners, aided older hearing-impaired listeners, and spectrally matched young normal-hearing listeners. The availability of each acoustic property was independently varied through noisy signal extraction. Thus, the full speech stimulus was presented with noise used to mask six different auditory channels. Perceptual weights were determined by correlating a listener's performance with the signal-to-noise ratio of each acoustic property on a trial-by-trial basis. Results demonstrate that temporal fine structure perceptual weights remain stable across the four listener groups. However, a different weighting topography was observed across the listener groups for envelope cues. Results suggest that spectral shaping used to preserve the audibility of the speech stimulus may alter the allocation of perceptual resources. The relative perceptual weighting of envelope cues may also change with age. Concurrent testing of sentences repeated once on a previous day demonstrated that weighting strategies for all listener groups can change, suggesting an initial stabilization period or susceptibility to auditory training.
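The trial-by-trial correlational method for deriving perceptual weights can be sketched with a toy simulation. Everything below is illustrative and assumed (the simulated listener, the logistic decision rule, and all numbers are made up, not taken from the study); it only shows the mechanics of correlating per-trial accuracy with each channel's SNR, assuming NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_channels = 5000, 6   # e.g., 3 bands x {envelope, fine structure}

# Per-trial SNR of each acoustic property (independent noisy extraction)
snr = rng.uniform(-10.0, 10.0, size=(n_trials, n_channels))

# Hypothetical listener whose accuracy is dominated by channel 0
true_weights = np.array([1.0, 0.2, 0.2, 0.2, 0.2, 0.2])
p_correct = 1.0 / (1.0 + np.exp(-(snr @ true_weights) / 5.0))
correct = rng.random(n_trials) < p_correct   # per-trial right/wrong

# Point-biserial correlation between accuracy and each channel's SNR,
# normalized to sum to one -> relative perceptual weights
raw = np.array([np.corrcoef(snr[:, j], correct)[0, 1]
                for j in range(n_channels)])
weights = raw / raw.sum()
```

With enough trials, the channel that actually drives performance receives the largest estimated weight, which is the logic behind the correlational weighting analysis described above.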
Project description: The speech signal contains many acoustic properties that may contribute differently to spoken word recognition. Previous studies have demonstrated that the importance of properties present during consonants or vowels is dependent upon the linguistic context (i.e., words versus sentences). The current study investigated three potentially informative acoustic properties that are present during consonants and vowels for monosyllabic words and sentences. Natural variations in fundamental frequency were either flattened or removed. The speech envelope and temporal fine structure were also investigated by limiting the availability of these cues via noisy signal extraction. Thus, this study investigated the contribution of these acoustic properties, present during either consonants or vowels, to overall word and sentence intelligibility. Results demonstrated that all processing conditions displayed better performance for vowel-only sentences, and this advantage remained despite removing dynamic cues of the fundamental frequency. Word and sentence comparisons suggest that speech information transmitted by the envelope is responsible, in part, for the greater vowel contributions in sentences, but is not predictive for isolated words.
Project description: The ability to discriminate changes in the fine structure of complex sounds is well developed in birds. However, the precise limit of this discrimination ability and how it is used in the context of natural communication remain unclear. Here we describe natural variability in acoustic fine structure of male and female zebra finch calls. Results from psychoacoustic experiments demonstrate that zebra finches are able to discriminate extremely small differences in fine structure, which are on the order of the variation in acoustic fine structure that is present in their vocal signals. Results from signal analysis methods also suggest that acoustic fine structure may carry information that distinguishes between biologically relevant categories including sex, call type and individual identity. Combined, our results are consistent with the hypothesis that zebra finches can encode biologically relevant information within the fine structure of their calls. This study provides a foundation for our understanding of how acoustic fine structure may be involved in animal communication.
Project description: Zebra finches produce a learned song that is rich in harmonic structure and highly stereotyped. More is generally known about how birds learn and produce this song than about how they perceive it. Here, zebra finches were trained with operant techniques to discriminate changes in natural and synthetic song motifs. Results show that zebra finches are quite insensitive to changes in the overall envelope of the motif, since they were unable to discriminate more than a doubling of inter-syllable interval durations. By contrast, they were quite sensitive to changes in individual syllables. A series of tests with synthetic song syllables, including some made of frozen noise and Schroeder harmonic complexes, showed that birds used a suite of acoustic cues in normal listening but that they could also distinguish among syllables simply on the basis of the temporal fine structure in the waveform. Thus, while syllable perception is maintained by multiple redundant cues, temporal fine structure features alone are sufficient for syllable discrimination and may be more important for communication than previously thought.
Project description: At least two modes of pitch perception exist: in one, the fundamental frequency (F0) of harmonic complex tones is estimated using the temporal fine structure (TFS) of individual low-order resolved harmonics; in the other, F0 is derived from the temporal envelope of high-order unresolved harmonics that interact in the auditory periphery. Pitch is typically more accurate in the former than in the latter mode. Another possibility is that pitch can sometimes be coded via the TFS from unresolved harmonics. A recent study supporting this third possibility [Moore et al. (2006a). J. Acoust. Soc. Am. 119, 480-490] based its conclusion on a condition where phase interaction effects (implying unresolved harmonics) accompanied accurate F0 discrimination (implying TFS processing). The present study tests whether these results were influenced by audible distortion products. Experiment 1 replicated the original results, obtained using a low-level background noise. However, experiments 2-4 found no evidence for the use of TFS cues with unresolved harmonics when the background noise level was raised, or the stimulus level was lowered, to render distortion inaudible. Experiment 5 measured the presence and phase dependence of audible distortion products. The results provide no evidence that TFS cues are used to code the F0 of unresolved harmonics.