Neuronal Correlates of Informational and Energetic Masking in the Human Brain in a Multi-Talker Situation.
ABSTRACT: Human listeners can follow the voice of one speaker while several others are talking at the same time. This process requires segregating the speech streams from each other and continuously directing attention to the target stream. We investigated the functional brain networks underlying this ability. Two speech streams were presented simultaneously to participants, who followed one of them (the target stream) and detected targets within it. The loudness of the distractor stream varied over five levels: moderately softer, slightly softer, equal, slightly louder, or moderately louder than the attended stream. Performance measures showed that the most demanding condition was the one with moderately softer distractors, which indicates that softer distractor speech may receive more covert attention than louder distractors and therefore requires more cognitive resources. EEG-based measurement of functional connectivity between various brain regions revealed frequency-band-specific networks: (1) energetic masking (comparing the louder-distractor conditions with the equal-loudness condition) was predominantly associated with stronger connectivity between frontal and temporal regions in the lower alpha (8-10 Hz) and gamma (30-70 Hz) bands; (2) informational masking (comparing the softer-distractor conditions with the equal-loudness condition) was associated with a distributed network between parietal, frontal, and temporal regions in the theta (4-8 Hz) and beta (13-30 Hz) bands. These results suggest distinct cognitive and neural processes for resolving interference from energetic vs. informational masking.
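The abstract does not name the connectivity metric; a common choice for band-specific EEG coupling is the phase-locking value (PLV) between band-pass-filtered channel pairs. Below is a minimal sketch, assuming numpy/scipy and synthetic single-trial data; the channel names, sampling rate, and band edges are illustrative, not the study's actual parameters.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_plv(x, y, fs, band):
    """Phase-locking value between two EEG channels within a frequency band.

    x, y : 1-D arrays of equal length (single-trial EEG)
    fs   : sampling rate in Hz
    band : (low, high) band edges in Hz, e.g. (8, 10) for lower alpha
    """
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    phase_x = np.angle(hilbert(filtfilt(b, a, x)))
    phase_y = np.angle(hilbert(filtfilt(b, a, y)))
    # PLV = magnitude of the mean phase-difference vector (0 = no locking, 1 = perfect)
    return np.abs(np.mean(np.exp(1j * (phase_x - phase_y))))

# Hypothetical frontal-temporal channel pair sharing a 9 Hz (lower alpha) component
fs = 500
t = np.arange(0, 2, 1 / fs)
rng = np.random.default_rng(0)
frontal = np.sin(2 * np.pi * 9 * t) + 0.5 * rng.standard_normal(t.size)
temporal = np.sin(2 * np.pi * 9 * t + 0.3) + 0.5 * rng.standard_normal(t.size)
print(band_plv(frontal, temporal, fs, (8, 10)))  # close to 1 for this pair
```

In a full analysis this would be computed per channel pair, band, and condition, then contrasted across the masking conditions.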
Project description:With a cochlear implant, when stimulation from multiple channels is interleaved, the perceived loudness is greater than the loudness associated with any of the individual channels presented in isolation. This phenomenon is known as loudness summation. This study examined whether loudness summation with monopolar and tripolar stimulation was equivalent at two loudness levels and two spacing configurations. Results suggest that loudness summation is similar for monopolar and tripolar modes. However, larger summation differences were observed for softer sounds, and for louder sounds with a larger spatial separation. The results are consistent with the idea that loudness summation depends on channel interaction and have implications for implementing current-focused processing strategies.
Project description:Preliminary data [M. Epstein and M. Florentine, Ear Hear. 30, 234-237 (2009)] obtained using speech stimuli from a visually present talker heard via loudspeakers in a sound-attenuating chamber indicate little difference in loudness when listening with one or two ears (i.e., significantly reduced binaural loudness summation, BLS), a phenomenon known as "binaural loudness constancy." These data challenge the current understanding drawn from laboratory measurements, which indicates that a tone presented binaurally is louder than the same tone presented monaurally. Twelve normal-hearing listeners were presented with recorded spondees, monaurally and binaurally, across a wide range of levels via earphones and a loudspeaker, with and without visual cues. Statistical analyses of binaural-to-monaural ratios of magnitude estimates indicate that the amount of BLS is significantly smaller for speech presented via a loudspeaker with visual cues than for any other combination of test parameters (i.e., speech presented via earphones or a loudspeaker without visual cues, and speech presented via earphones with visual cues). These results indicate that the loudness of a visually present talker in everyday environments is little affected by switching between binaural and monaural listening. This supports the phenomenon of binaural loudness constancy and underscores the importance of ecological validity in loudness research.
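The key quantity here, the binaural-to-monaural ratio of magnitude estimates, is simple arithmetic; a minimal sketch with invented per-listener numbers (the abstract reports no raw estimates):

```python
import numpy as np

# Hypothetical loudness magnitude estimates for the same spondee and level
monaural = np.array([10.0, 12.0, 9.0, 11.0])   # one ear
binaural = np.array([13.0, 15.0, 12.0, 14.0])  # two ears

# A ratio near 2 would indicate full binaural loudness summation;
# a ratio near 1 indicates binaural loudness constancy.
bls_ratio = binaural / monaural
print(bls_ratio.mean())  # ~1.29 for these made-up values
```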
Project description:This research studied whether the mode of input (auditory versus audiovisual) influenced semantic access by speech in children with sensorineural hearing impairment (HI). Participants, 31 children with HI and 62 children with normal hearing (NH), were tested with the authors' new multimodal picture-word task. Children were instructed to name pictures displayed on a monitor and to ignore auditory or audiovisual speech distractors. The semantic content of the distractors was varied to be related versus unrelated to the pictures (e.g., the picture dog paired with the distractor bear versus cheese, respectively). In children with NH, picture-naming times were slower in the presence of semantically related distractors. This slowing, called semantic interference, is attributed to the meaning-related picture-distractor entries competing for selection and control of the response (the lexical selection by competition hypothesis). Recently, a modification of the lexical selection by competition hypothesis, called the competition threshold (CT) hypothesis, proposed that (1) the competition between the picture-distractor entries is determined by a threshold, and (2) distractors with experimentally reduced fidelity cannot reach the CT. Thus, semantically related distractors with reduced fidelity do not produce the normal interference effect but instead produce no effect or semantic facilitation (faster picture-naming times for semantically related versus unrelated distractors). Facilitation occurs because the activation level of the semantically related distractor with reduced fidelity (1) is not sufficient to exceed the CT and produce interference but (2) is sufficient to activate its concept, which then strengthens the activation of the picture and facilitates naming. This research investigated whether the proposals of the CT hypothesis generalize to the auditory domain, to the natural degradation of speech due to HI, and to participants who are children. Our multimodal picture-word task allowed us to (1) quantify picture-naming results in the presence of auditory speech distractors and (2) probe whether the addition of visual speech enriched the fidelity of the auditory input sufficiently to influence results. In the HI group, the auditory distractors produced no effect or a facilitative effect, in agreement with the proposals of the CT hypothesis. In contrast, the audiovisual distractors produced the normal semantic interference effect. Results in the HI versus NH groups differed significantly for the auditory mode, but not for the audiovisual mode. This research indicates that the lower-fidelity auditory speech associated with HI affects the normalcy of semantic access in children. Further, adding visual speech enriches the lower-fidelity auditory input sufficiently to produce the semantic interference effect typical of children with NH.
Project description:This study examines the lexical representation and processing of noun-noun compounds and their grammatical gender during speech production in German, a language that codes for grammatical gender (masculine, feminine, and neuter). Using a picture-word interference paradigm, participants produced determiner-compound noun phrases in response to pictures while ignoring written distractor words. Compound targets were either semantically transparent (e.g., birdhouse) or opaque (e.g., hotdog), and their constituent nouns either had the same or a different gender (internal gender match). Effects of gender-congruent but otherwise unrelated distractor nouns, and of two morphologically related distractors corresponding to the first or second constituent, were assessed relative to a completely unrelated, gender-incongruent distractor baseline. Both constituent distractors strongly facilitated compound naming, and these effects were independent of the targets' semantic transparency. This supports the retrieval of constituent morphemes for both semantically transparent and opaque compounds during speech production. Furthermore, gender congruency between compounds and distractors did not speed up naming in general but interacted with the gender match of the compounds' constituent nouns and with their semantic transparency. A significant gender-congruency effect was obtained only for semantically transparent compounds consisting of two constituent nouns of the same gender. In principle, this pattern is compatible with a multiple-lemma representation account for semantically transparent, but not for opaque, compounds. The data also fit a more parsimonious, holistic representation of all compounds at the lemma level, once differences in co-activation patterns between semantically transparent and opaque compounds are considered.
Project description:This study investigated the nature of the interference effect of semantically related distractors in the picture-word interference paradigm, which has been claimed to arise either from competition between lexical representations of target and distractor or from a late response-exclusion mechanism that removes the distractor from a response buffer. EEG was recorded while participants overtly named pictures accompanied by categorically related versus unrelated written distractor words. In contrast to previous studies, stimuli were presented for only 250 ms to avoid any re-processing. ERP effects of relatedness were found around 290, 470, 540, and 660 ms post stimulus onset. In addition, related distractors led to an increase in midfrontal theta power, especially from about 440 to 540 ms, as well as decreased high-beta power between 40 and 110 ms and increased high-beta power between 275 and 340 ms post stimulus onset. Response-locked analyses showed no differences in ERPs, but increased low- and high-beta power for related distractors in various time windows, most importantly a high-beta power increase between -175 and -155 ms before speech onset. These results suggest that the semantic distractor effect is a combination of several effects and that the lexical competition account and the response exclusion account each capture a part, but not all aspects, of the effect.
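The abstract does not describe the time-frequency method; power estimates of this kind are commonly obtained by complex Morlet wavelet convolution. A minimal numpy sketch under that assumption; the 6 Hz center frequency, epoch length, and the 440-540 ms window are illustrative:

```python
import numpy as np

def morlet_power(signal, fs, freq, n_cycles=7):
    """Single-frequency power envelope via complex Morlet wavelet convolution.

    signal : 1-D EEG trace (e.g., a midfrontal channel, stimulus-locked)
    fs     : sampling rate in Hz
    freq   : center frequency in Hz (e.g., 6 for midfrontal theta)
    """
    sigma = n_cycles / (2 * np.pi * freq)              # Gaussian width in seconds
    t = np.arange(-4 * sigma, 4 * sigma, 1 / fs)
    wavelet = np.exp(2j * np.pi * freq * t) * np.exp(-t**2 / (2 * sigma**2))
    wavelet /= np.sqrt(np.sum(np.abs(wavelet) ** 2))   # unit-energy normalization
    analytic = np.convolve(signal, wavelet, mode="same")
    return np.abs(analytic) ** 2                       # instantaneous power

# Mean theta power in the 440-540 ms window of a synthetic 2-s epoch
fs = 500
rng = np.random.default_rng(1)
epoch = rng.standard_normal(2 * fs)
power = morlet_power(epoch, fs, freq=6)
window = slice(int(0.44 * fs), int(0.54 * fs))
print(power[window].mean())
```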
Project description:When selectively attending to a speech stream in multi-talker scenarios, low-frequency cortical activity is known to synchronize selectively to fluctuations in the attended speech signal. Older listeners with age-related sensorineural hearing loss (presbycusis) often struggle to understand speech in such situations, even when wearing a hearing aid. Yet, it is unclear whether a peripheral hearing loss degrades the attentional modulation of cortical speech tracking. Here, we used psychoacoustics and electroencephalography (EEG) in male and female human listeners to examine potential effects of hearing loss on EEG correlates of speech envelope synchronization in cortex. Behaviorally, older hearing-impaired (HI) listeners showed degraded speech-in-noise recognition and reduced temporal acuity compared with age-matched normal-hearing (NH) controls. During EEG recordings, we used a selective attention task with two spatially separated simultaneous speech streams in which NH and HI listeners both showed high speech recognition performance. Low-frequency (<10 Hz) envelope-entrained EEG responses were enhanced in the HI listeners, not only for the attended speech but also for tone sequences modulated at slow rates (4 Hz) during passive listening. Compared with the attended speech, responses to the ignored stream were reduced in both HI and NH listeners, allowing the attended target to be classified from single-trial EEG data with similarly high accuracy in the two groups. However, despite robust attention-modulated speech entrainment, the HI listeners rated the competing-speech task as more difficult. These results suggest that the speech-in-noise problems experienced by older HI listeners are not necessarily associated with degraded attentional selection. SIGNIFICANCE STATEMENT People with age-related sensorineural hearing loss often struggle to follow speech in the presence of competing talkers. It is currently unclear whether hearing impairment impairs the ability to use selective attention to suppress distracting speech in situations where the distractor is well segregated from the target. Here, we report amplified envelope-entrained cortical EEG responses to attended speech and to simple tones modulated at speech rates (4 Hz) in listeners with age-related hearing loss. Critically, despite increased self-reported listening difficulties, cortical synchronization to speech mixtures was robustly modulated by selective attention in listeners with hearing loss. This allowed the attended talker to be classified from single-trial EEG responses with high accuracy in both older hearing-impaired listeners and age-matched normal-hearing controls.
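As a toy illustration of envelope entrainment and single-trial attended-talker classification (the study's actual pipeline is not given in this abstract; published work typically uses linear stimulus-reconstruction models), one can compare correlations between low-pass-filtered EEG and each talker's speech envelope. A sketch with synthetic data, assuming numpy/scipy:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def lowpass(x, fs, cutoff=10.0):
    b, a = butter(4, cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, x)

def envelope(audio, fs):
    """Broadband speech envelope: magnitude of the analytic signal, low-passed."""
    return lowpass(np.abs(hilbert(audio)), fs)

def attended_talker(eeg, env_a, env_b, fs):
    """Pick the stream whose slow envelope correlates best with low-passed EEG."""
    eeg_lp = lowpass(eeg, fs)
    r_a = np.corrcoef(eeg_lp, env_a)[0, 1]
    r_b = np.corrcoef(eeg_lp, env_b)[0, 1]
    return "A" if r_a > r_b else "B"

# Synthetic 30-s trial at 100 Hz: the "EEG" is entrained to stream A's envelope
fs, n = 100, 30 * 100
rng = np.random.default_rng(2)
env_a = lowpass(np.abs(rng.standard_normal(n)), fs)
env_b = lowpass(np.abs(rng.standard_normal(n)), fs)
eeg = 0.6 * env_a + rng.standard_normal(n)
print(attended_talker(eeg, env_a, env_b, fs))  # -> "A"
```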
Project description:Anticipatory states help prioritise relevant perceptual targets over competing distractor stimuli and amplify early brain responses to these targets. Here we combine electroencephalography recordings in humans with multivariate stimulus decoding to ask whether anticipation also increases the amount of target-identity information contained in these responses, and how targets are prioritised over distractors when the two compete in time. We show that anticipatory cues not only boost visual target representations but also delay the interference on these representations caused by temporally adjacent distractor stimuli, possibly marking a protective window reserved for high-fidelity target processing. Enhanced target decoding and distractor resistance are further predicted by the attenuation of posterior 8-14 Hz alpha oscillations. These findings reveal multiple mechanisms by which anticipatory states help prioritise targets over temporally competing distractors, and they highlight the potential of non-invasive multivariate electrophysiology for tracking cognitive influences on perception in temporally crowded contexts.
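Time-resolved multivariate decoding of this kind is commonly implemented by fitting a classifier across channels at each time point; the sketch below, using scikit-learn on synthetic epochs, illustrates the general approach rather than the authors' exact pipeline:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def decode_over_time(epochs, labels, cv=5):
    """Cross-validated decoding accuracy of stimulus identity at each time point.

    epochs : array of shape (n_trials, n_channels, n_times)
    labels : array of shape (n_trials,) with class labels
    """
    n_trials, n_channels, n_times = epochs.shape
    scores = np.empty(n_times)
    for t in range(n_times):
        clf = LogisticRegression(max_iter=1000)
        scores[t] = cross_val_score(clf, epochs[:, :, t], labels, cv=cv).mean()
    return scores

# Synthetic demo: a class difference appears in channel 0 from sample 25 onward
rng = np.random.default_rng(3)
X = rng.standard_normal((80, 32, 50))
y = rng.integers(0, 2, 80)
X[y == 1, 0, 25:] += 1.0
print(decode_over_time(X, y)[[10, 40]])  # near chance early, above chance late
```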
Project description:Recent studies of binary masking techniques assume that each time-frequency (T-F) unit contributes an equal amount to the overall intelligibility of speech. The present study demonstrates that the importance of each T-F unit to speech intelligibility varies with speech content. Specifically, T-F units fall into two classes: speech-present T-F units and speech-absent T-F units. Results indicate that the importance of each speech-present T-F unit to intelligibility is closely related to the loudness of its target component, whereas the importance of each speech-absent T-F unit varies with the loudness of its masker component. Two types of mask errors are also considered: miss errors and false-alarm errors. Consistent with previous work, false-alarm errors are shown to be more harmful to intelligibility than miss errors when the mixture signal-to-noise ratio (SNR) is below 0 dB. However, the relative importance of the two error types depends on the SNR level of the input speech signal. Based on these observations, a mask-based objective measure, the loudness-weighted hit-false, is proposed for predicting speech intelligibility. The proposed measure shows significantly higher correlation with intelligibility than two existing mask-based objective measures.
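A schematic of a hit-minus-false-alarm score computed over ideal-binary-mask units, with raw unit energy standing in for loudness (the paper's exact loudness weighting is not reproduced here; the function name and threshold are hypothetical):

```python
import numpy as np

def lw_hit_false(target_spec, masker_spec, est_mask, lc=0.0):
    """Loudness-weighted hit minus false alarm for an estimated binary mask.

    target_spec, masker_spec : T-F magnitude spectrograms of target and masker
    est_mask : estimated binary mask (1 = retain unit, 0 = discard)
    lc       : local SNR criterion in dB defining the ideal binary mask

    Unit energy is used as a stand-in for loudness (simplifying assumption).
    """
    eps = 1e-12
    snr = 20 * np.log10(target_spec / (masker_spec + eps) + eps)
    speech_present = snr > lc                          # ideal binary mask
    w_hit = target_spec[speech_present] ** 2           # weight by target energy
    w_fa = masker_spec[~speech_present] ** 2           # weight by masker energy
    hit = np.sum(w_hit * est_mask[speech_present]) / (w_hit.sum() + eps)
    fa = np.sum(w_fa * est_mask[~speech_present]) / (w_fa.sum() + eps)
    return hit - fa                                    # higher = more intelligible

# Sanity check: the ideal mask itself scores the maximum of 1.0
rng = np.random.default_rng(4)
tgt, msk = rng.random((64, 100)), rng.random((64, 100))
ibm = (20 * np.log10(tgt / (msk + 1e-12) + 1e-12) > 0.0).astype(float)
print(lw_hit_false(tgt, msk, ibm))  # -> 1.0
```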
Project description:Rapid visual flicker is known to capture attention. Here we show that slow flicker can also capture attention under reciprocal temporal conditions. Observers searched for a target line (vertical or horizontal) among tilted distractors. Distractor lines were surrounded by luminance-modulating annuli, all flickering sinusoidally at 1.3 or 12.1 Hz, while the target's annulus flickered at frequencies within this range. Search times improved with increasing target/distractor frequency differences. For target-distractor frequency separations >5 Hz, reaction times were minimal, with high-frequency targets correctly identified more rapidly than low-frequency targets (~400 ms). Critically, at these optimal frequency separations, search times for low- and high-frequency targets were unaffected by set size (slow flicker popped out from fast flicker, and vice versa), indicating parallel and symmetric search performance for high- and low-frequency targets. In a "cost" experiment using 1.3 and 12.1 Hz flicker, the unique flickering annulus sometimes surrounded a distractor and, on other trials, surrounded the target. When centered on a distractor, the unique frequency produced a clear and symmetrical search cost. Together, these symmetric pop-out effects and search costs demonstrate that temporal frequency is a pre-attentive visual feature capable of capturing attention, and that relative rather than absolute frequencies are critical. The shape of the search functions strongly suggests that early visual temporal-frequency filters underlie these effects.
Project description:The present study investigated the interplay between selective inhibition (the ability to suppress specific competing responses) and nonselective inhibition (the ability to suppress any inappropriate response) during single-word production. To this end, we combined two well-established research paradigms: the picture-word interference task and the stop-signal task. Selective inhibition was assessed by instructing participants to name target pictures (e.g., dog) in the presence of semantically related (e.g., cat) or unrelated (e.g., window) distractor words. Nonselective inhibition was tested by occasionally presenting a visual stop-signal, indicating that participants should withhold their verbal response. The stop-signal was presented either early (250 ms), to interrupt the lexical selection stage, or late (325 ms), to influence the word-encoding stage of the speech production process. We found longer naming latencies for pictures with semantically related distractors than with unrelated distractors (the semantic interference effect). The results further showed that, at both delays, stopping latencies (i.e., stop-signal RTs) were prolonged for naming pictures with semantically related distractors compared with unrelated distractors. Taken together, our findings suggest that selective and nonselective inhibition share, at least in part, a common inhibitory mechanism operating at different stages of the speech production process.
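Stop-signal RTs are conventionally estimated with the race-model integration method; the abstract does not state the exact procedure, so the sketch below is an assumption, with purely illustrative numbers:

```python
import numpy as np

def ssrt_integration(go_rts, p_respond, ssd):
    """Stop-signal RT via the integration method.

    go_rts    : go-trial RTs in ms (here, picture-naming latencies)
    p_respond : probability of responding despite the stop-signal
    ssd       : stop-signal delay in ms (e.g., 250 or 325)
    """
    go_sorted = np.sort(np.asarray(go_rts))
    # RT at the quantile matching P(respond | stop-signal), minus the delay
    nth = int(np.ceil(p_respond * go_sorted.size)) - 1
    return go_sorted[nth] - ssd

# Illustrative values only (not data from the study)
rng = np.random.default_rng(5)
go_rts = rng.normal(800, 120, 200)  # naming latencies in one distractor condition
print(ssrt_integration(go_rts, p_respond=0.45, ssd=250))
```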