Bilateral Versus Unilateral Cochlear Implantation in Adult Listeners: Speech-On-Speech Masking and Multitalker Localization.
ABSTRACT: Binaural hearing helps normal-hearing listeners localize sound sources and understand speech in noise. However, it is not fully understood how far this is the case for bilateral cochlear implant (CI) users. To determine the potential benefits of bilateral over unilateral CIs, speech comprehension thresholds (SCTs) were measured in seven Japanese bilateral CI recipients using Helen test sentences (translated into Japanese) in a two-talker speech interferer presented from the front (co-located with the target speech), ipsilateral to the first-implanted ear (at +90° or -90°), and spatially symmetric at ±90°. Spatial release from masking was calculated as the difference between co-located and spatially separated SCTs. Localization was assessed in the horizontal plane by presenting either male or female speech or both simultaneously. All measurements were performed bilaterally and unilaterally (with the first implanted ear) inside a loudspeaker array. Both SCTs and spatial release from masking were improved with bilateral CIs, demonstrating mean bilateral benefits of 7.5 dB in spatially asymmetric and 3 dB in spatially symmetric speech mixture. Localization performance varied strongly between subjects but was clearly improved with bilateral over unilateral CIs with the mean localization error reduced by 27°. Surprisingly, adding a second talker had only a negligible effect on localization.
Project description:Spatial release from masking was studied in a three-talker soundfield listening experiment. The target talker was presented at 0 degrees azimuth and the maskers were either colocated or symmetrically positioned around the target, with a different masker talker on each side. The symmetric placement greatly reduced any "better ear" listening advantage. When the maskers were separated from the target by +/-15 degrees , the average spatial release from masking was 8 dB. Wider separations increased the release to more than 12 dB. This large effect was eliminated when binaural cues and perceived spatial separation were degraded by covering one ear with an earplug and earmuff. Increasing reverberation in the room increased the target-to-masker ratio (TM) for the separated, but not colocated, conditions reducing the release from masking, although a significant advantage of spatial separation remained. Time reversing the masker speech improved performance in both the colocated and spatially separated cases but lowered TM the most for the colocated condition, also resulting in a reduction in the spatial release from masking. Overall, the spatial tuning observed appears to depend on the presence of interaural differences that improve the perceptual segregation of sources and facilitate the focus of attention at a point in space.
Project description:Hearing loss with increasing age adversely affects the ability to understand speech, an effect that results partly from reduced audibility. The aims of this study were to establish whether aging reduces speech intelligibility for listeners with normal audiograms, and, if so, to assess the relative contributions of auditory temporal and cognitive processing. Twenty-one older normal-hearing (ONH; 60-79 years) participants with bilateral audiometric thresholds ? 20 dB HL at 0.125-6 kHz were matched to nine young (YNH; 18-27 years) participants in terms of mean audiograms, years of education, and performance IQ. Measures included: (1) identification of consonants in quiet and in noise that was unmodulated or modulated at 5 or 80 Hz; (2) identification of sentences in quiet and in co-located or spatially separated two-talker babble; (3) detection of modulation of the temporal envelope (TE) at frequencies 5-180 Hz; (4) monaural and binaural sensitivity to temporal fine structure (TFS); (5) various cognitive tests. Speech identification was worse for ONH than YNH participants in all types of background. This deficit was not reflected in self-ratings of hearing ability. Modulation masking release (the improvement in speech identification obtained by amplitude modulating a noise background) and spatial masking release (the benefit obtained from spatially separating masker and target speech) were not affected by age. Sensitivity to TE and TFS was lower for ONH than YNH participants, and was correlated positively with speech-in-noise (SiN) identification. Many cognitive abilities were lower for ONH than YNH participants, and generally were correlated positively with SiN identification scores. The best predictors of the intelligibility of SiN were composite measures of cognition and TFS sensitivity. These results suggest that declines in speech perception in older persons are partly caused by cognitive and perceptual changes separate from age-related changes in audiometric sensitivity.
Project description:This study examined spatial release from masking (SRM) when a target talker was masked by competing talkers or by other types of sounds. The focus was on the role of interaural time differences (ITDs) and time-varying interaural level differences (ILDs) under conditions varying in the strength of informational masking (IM). In the first experiment, a target talker was masked by two other talkers that were either colocated with the target or were symmetrically spatially separated from the target with the stimuli presented through loudspeakers. The sounds were filtered into different frequency regions to restrict the available interaural cues. The largest SRM occurred for the broadband condition followed by a low-pass condition. However, even the highest frequency bandpass-filtered condition (3-6 kHz) yielded a significant SRM. In the second experiment the stimuli were presented via earphones. The listeners identified the speech of a target talker masked by one or two other talkers or noises when the maskers were colocated with the target or were perceptually separated by ITDs. The results revealed a complex pattern of masking in which the factors affecting performance in colocated and spatially separated conditions are to a large degree independent.
Project description:Speech-on-speech recognition differs substantially across stimuli, but it is unclear what role linguistic features of the masker play in this variability. The linguistic similarity hypothesis suggests similarity between sentence-level semantic content of the target and masker speech increases masking. Sentence recognition in a two-talker masker was evaluated with respect to semantic content and syntactic structure of the masker (experiment 1) and linguistic similarity of the target and masker (experiment 2). Target and masker sentences were semantically meaningful or anomalous. Masker syntax was varied or the same across sentences. When other linguistic features of the masker were controlled, variability in syntactic structure across masker tokens was only relevant when the masker was played continuously (as opposed to gated); when played continuously, sentence-recognition thresholds were poorer with variable than consistent masker syntax, but this effect was small (0.5?dB). When the syntactic structure of the masker was held constant, semantic meaningfulness of the masker did not increase masking, and at times performance was better for the meaningful than the anomalous masker. These data indicate that sentence-level semantic content of the masker speech does not influence speech-on-speech masking. Further, no evidence that similarities between target/masker sentence-level semantic content increases masking was found.
Project description:To hear out a conversation against other talkers listeners overcome energetic and informational masking. Largely attributed to top-down processes, information masking has also been demonstrated using unintelligible speech and amplitude-modulated maskers suggesting bottom-up processes. We examined the role of speech-like amplitude modulations in information masking using a spatial masking release paradigm. Separating a target talker from two masker talkers produced a 20 dB improvement in speech reception threshold; 40% of which was attributed to a release from informational masking. When across frequency temporal modulations in the masker talkers are decorrelated the speech is unintelligible, although the within frequency modulation characteristics remains identical. Used as a masker as above, the information masking accounted for 37% of the spatial unmasking seen with this masker. This unintelligible and highly differentiable masker is unlikely to involve top-down processes. These data provides strong evidence of bottom-up masking involving speech-like, within-frequency modulations and that this, presumably low level process, can be modulated by selective spatial attention.
Project description:Speech intelligibility was measured for sentences presented in spectrally matched steady noise, single-talker interference, or speech-modulated noise. The stimuli were unfiltered or were low-pass (LP) (1200 Hz cutoff) or high-pass (HP) (1500 Hz cutoff) filtered. The cutoff frequencies were selected to produce equal performance in both LP and HP conditions in steady noise and to limit access to the temporal fine structure of resolved harmonics in the HP conditions. Masking release, or the improvement in performance between the steady noise and single-talker interference, was substantial with no filtering. Under LP and HP filtering, masking release was roughly equal but was much less than in unfiltered conditions. When the average F0 of the interferer was shifted lower than that of the target, similar increases in masking release were observed under LP and HP filtering. Similar LP and HP results were also obtained for the speech-modulated-noise masker. The findings are not consistent with the idea that pitch conveyed by the temporal fine structure of low-order harmonics plays a crucial role in masking release. Instead, any reduction in speech redundancy, or manipulation that increases the target-to-masker ratio necessary for intelligibility to beyond around 0 dB, may result in reduced masking release.
Project description:This study evaluated the role of temporal fine structure in the lateralization and understanding of speech in six normal-hearing listeners. Interaural time differences (ITDs) were introduced to invoke lateralization. Speech reception thresholds (SRTs) were evaluated in backgrounds of two-talker babble and speech-shaped noise. Two-syllable words with ITDs of 0 and 700 micros were used as targets. A vocoder technique, which systematically randomized fine structure, was used to evaluate the effects of fine structure on these tasks. Randomization of temporal fine structure was found to significantly reduce the ability of normal-hearing listeners to lateralize words, although for many listeners, good lateralization performance was achieved with as much as 80% fine-structure randomization. Most listeners demonstrated some rudimentary ability to lateralize with 100% fine-structure randomization. When ITDs were 0 micros, randomization of fine structure had a much greater effect on SRT in two-talker babble than in speech-shaped noise. Binaural advantages were also observed. In steady noise, the difference in SRT between words with 0- vs 700-micros ITDs was, on average, 6 dB with no fine-structure randomization and 2 dB with 100% fine-structure randomization. In two-talker babble this difference was 1.9 dB and, for most listeners, showed little effect of the degree of fine-structure randomization. These results suggest that (1) improved delivery of temporal fine structure would improve speech understanding in noise for implant recipients, (2) bilateral implant recipients might benefit from temporal envelope ITDs, and (3) improved delivery of temporal information could improve binaural benefits.
Project description:People with schizophrenia exhibit impairments in target-speech recognition (TSR) against multiple-talker-induced informational speech masking. Up to date, the underlying neural mechanisms and its relationships with psychotic symptoms remain largely unknown. This study aimed to investigate whether the schizophrenia-associated TSR impairment contribute to certain psychotic symptoms by sharing underlying alternations in cortical gray-matter volume (GMV) with the psychotic symptoms. Participants with schizophrenia (<i>N</i> = 34) and their matched healthy controls (<i>N</i> = 29) were tested for TSR against a two-talker-speech masker. Psychotic symptoms of participants with schizophrenia were evaluated using the Positive and Negative Syndrome Scale. The regional GMV across various cortical regions was assessed using the voxel-based morphometry. The results of partial-correlation and mediation analyses showed that in participants with schizophrenia, the TSR was negatively correlated with the delusion severity, but positively with the GMV in the bilateral superior/middle temporal cortex, bilateral insular, left medial orbital frontal gyrus, left Rolandic operculum, left mid-cingulate cortex, left posterior fusiform, and left cerebellum. Moreover, the association between GMV and delusion was based on the mediating role played by the TSR performance. Thus, in people with schizophrenia, both delusions and the augmented vulnerability of TSR to informational masking are associated with each other and share the underlying cortical GMV reduction, suggesting that the origin of delusion in schizophrenia may be related to disorganized or limited informational processing (e.g., the incapability of adequately filtering information from multiple sources at the perceptual level). The TSR impairment can be a potential marker for predicting delusion severity.
Project description:<h4>Objectives</h4>This study aimed to evaluate the informational component of speech-on-speech masking. Speech perception in the presence of a competing talker involves not only informational masking (IM) but also a number of masking processes involving interaction of masker and target energy in the auditory periphery. Such peripherally generated masking can be eliminated by presenting the target and masker in opposite ears (dichotically). However, this also reduces IM by providing listeners with lateralization cues that support spatial release from masking (SRM). In tonal sequences, IM can be isolated by rapidly switching the lateralization of dichotic target and masker streams across the ears, presumably producing ambiguous spatial percepts that interfere with SRM. However, it is not clear whether this technique works with speech materials.<h4>Design</h4>Speech reception thresholds (SRTs) were measured in 17 young normal-hearing adults for sentences produced by a female talker in the presence of a competing male talker under three different conditions: diotic (target and masker in both ears), dichotic, and dichotic but switching the target and masker streams across the ears. Because switching rate and signal coherence were expected to influence the amount of IM observed, these two factors varied across conditions. When switches occurred, they were either at word boundaries or periodically (every 116 msec) and either with or without a brief gap (84 msec) at every switch point. In addition, SRTs were measured in a quiet condition to rule out audibility as a limiting factor.<h4>Results</h4>SRTs were poorer for the four switching dichotic conditions than for the nonswitching dichotic condition, but better than for the diotic condition. Periodic switches without gaps resulted in the worst SRTs compared to the other switch conditions, thus maximizing IM.<h4>Conclusions</h4>These findings suggest that periodically switching the target and masker streams across the ears (without gaps) was the most efficient in disrupting SRM. Thus, this approach can be used in experiments that seek a relatively pure measure of IM, and could be readily extended to translational research.
Project description:Are musicians better able to understand speech in noise than non-musicians? Recent findings have produced contradictory results. Here we addressed this question by asking musicians and non-musicians to understand target sentences masked by other sentences presented from different spatial locations, the classical 'cocktail party problem' in speech science. We found that musicians obtained a substantial benefit in this situation, with thresholds ~6?dB better than non-musicians. Large individual differences in performance were noted particularly for the non-musically trained group. Furthermore, in different conditions we manipulated the spatial location and intelligibility of the masking sentences, thus changing the amount of 'informational masking' (IM) while keeping the amount of 'energetic masking' (EM) relatively constant. When the maskers were unintelligible and spatially separated from the target (low in IM), musicians and non-musicians performed comparably. These results suggest that the characteristics of speech maskers and the amount of IM can influence the magnitude of the differences found between musicians and non-musicians in multiple-talker "cocktail party" environments. Furthermore, considering the task in terms of the EM-IM distinction provides a conceptual framework for future behavioral and neuroscientific studies which explore the underlying sensory and cognitive mechanisms contributing to enhanced "speech-in-noise" perception by musicians.