Generalization of multisensory perceptual learning.
ABSTRACT: Life in a multisensory world requires the rapid and accurate integration of stimuli across the different senses. In this process, the temporal relationship between stimuli is critical in determining which stimuli share a common origin. Numerous studies have described a multisensory temporal binding window, the time window within which audiovisual stimuli are likely to be perceptually bound. In addition to characterizing this window's size, recent work has shown it to be malleable, with the capacity for substantial narrowing following perceptual training. However, whether these effects generalize to other measures of perception is not known. This question was examined by characterizing the ability of training on a simultaneity judgment task to influence perception of the temporally dependent sound-induced flash illusion (SIFI). Results do not demonstrate a change in performance on the SIFI itself following training. However, the data do show an improved ability to discriminate rapidly presented two-flash control conditions following training. These effects were specific to training and scaled with the degree of temporal-window narrowing exhibited. Results do not support generalization of multisensory perceptual learning to other multisensory tasks. However, they do show that training improves visual temporal acuity, suggesting a generalization effect of multisensory training on unisensory abilities.
Project description: We investigated the origin of two previously reported general rules of perceptual learning. First, initial discrimination thresholds and the amount of learning have been found to be related through a Weber-like law. Second, increased training length has been reported to impair observers' ability to generalize the acquired knowledge to a new context. Using a five-day training protocol, separate groups of observers were trained to perform discrimination around two different reference values of either contrast (73% and 30%) or orientation (25° and 0°). In line with previous research, we found a Weber-like law between initial performance and the amount of learning, regardless of whether the tested attribute was contrast or orientation. However, we also showed that this relationship directly reflected observers' perceptual scaling function relating physical intensities to perceptual magnitudes, suggesting that participants learned by an equal amount in their internal perceptual space in all conditions. In addition, we found that with the typical five-day training period, the extent of generalization was proportional to the amount of learning, seemingly contradicting the previously reported diminishing generalization with practice. This result suggests that the negative link between generalization and the length of training found in earlier studies might have been due to overfitting after longer training, and not directly due to the amount of learning per se.
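A short worked formulation may make the Weber-like relationship concrete. This is an illustrative reconstruction assuming logarithmic scaling, not the authors' fitted model: if a fixed amount of learning corresponds to a constant gain Δψ in perceptual space, the resulting change in physical threshold scales with the local slope of the perceptual scaling function ψ(I).

```latex
% Illustrative sketch; the form of psi and the constant k are assumptions.
\[
\Delta I \approx \frac{\Delta\psi}{\psi'(I)}, \qquad
\psi(I) = k \ln I \;\Rightarrow\; \psi'(I) = \frac{k}{I}
\;\Rightarrow\; \Delta I \approx \frac{\Delta\psi}{k}\, I .
\]
```

Under this assumption, equal learning in perceptual space yields a physical improvement proportional to the reference intensity I, which is exactly a Weber-like law.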
Project description: To accurately represent the environment, our brains must integrate sensory signals from a common source while segregating those from independent sources. A reasonable strategy for performing this task is to restrict integration to cues that coincide in space and time. However, because multisensory signals are subject to differential transmission and processing delays, the brain must retain a degree of tolerance for temporal discrepancies. Recent research suggests that the width of this 'temporal binding window' can be reduced through perceptual learning; however, little is known about the mechanisms underlying these experience-dependent effects. Here, in separate experiments, we measure the temporal and spatial binding windows of human participants before and after training on an audiovisual temporal discrimination task. We show that training leads to two distinct effects on multisensory integration: (i) a specific narrowing of the temporal binding window that does not transfer to spatial binding, and (ii) a general reduction in the magnitude of crossmodal interactions across all spatiotemporal disparities. These effects arise naturally from a Bayesian model of causal inference in which learning improves the precision of audiovisual timing estimation, whilst concomitantly decreasing the prior expectation that stimuli emanate from a common source.
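As an illustration of how a causal-inference model yields both effects, the sketch below computes the posterior probability of a common cause from a measured audiovisual asynchrony. The parameter names and values are assumptions for demonstration; the study's actual model was fit to data.

```python
import numpy as np

def p_common(delta_t, sigma_t, p_c, asynchrony_range=1.0):
    """Posterior probability that audio and visual signals share a cause,
    given a measured onset asynchrony delta_t (in seconds).

    sigma_t : timing noise; training is assumed to reduce it (narrower window)
    p_c     : prior probability of a common cause; training is assumed to
              reduce it (weaker integration at all disparities)
    """
    # Likelihood of the asynchrony under a common cause: centered on zero
    like_common = np.exp(-delta_t**2 / (2 * sigma_t**2)) / (
        np.sqrt(2 * np.pi) * sigma_t)
    # Likelihood under independent causes: flat over plausible asynchronies
    like_indep = 1.0 / asynchrony_range
    return like_common * p_c / (like_common * p_c + like_indep * (1 - p_c))

dt = np.linspace(-0.4, 0.4, 9)
before = p_common(dt, sigma_t=0.15, p_c=0.8)
after = p_common(dt, sigma_t=0.08, p_c=0.6)   # illustrative post-training values
```

Reducing sigma_t narrows the range of asynchronies over which the common-cause posterior stays high (the specific temporal-window narrowing), while reducing p_c lowers the curve at every disparity (the general reduction in crossmodal interaction).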
Project description: While it is commonly held that the capacity to learn is greatest in the young, there have been few direct comparisons of the response to training across age groups. Here, adolescents (11-17 years, n = 20) and adults (≥18 years, n = 11) practiced detecting a backward-masked tone for ∼1 h/day for 10 days. Nearly every adult, but only half of the adolescents, improved across sessions, and the adolescents who learned did so more slowly than the adults. Nevertheless, the adolescent and adult learners showed the same generalization pattern, improving on untrained backward-masking, but not forward- or simultaneous-masking, conditions. Another subset of adolescents (n = 6) instead got worse on the trained condition. This worsening, unlike learning, generalized to an untrained forward-masking condition, but not to a backward-masking condition. Within sessions, both age groups got worse, but the worsening was greater for the adolescents. These maturational changes in the response to training largely followed those previously reported for temporal-interval discrimination. Overall, the results suggest that late-maturing processes affect the response to perceptual training and that some of these processes may be shared between tasks. Further, the different developmental rates for learning and generalization, and the different generalization patterns for learning and worsening, imply that learning, generalization, and worsening may have different origins.
Project description: Fluctuations in sound amplitude provide important cues to the identity of many sounds, including speech. Of interest here was whether the ability to detect these fluctuations can be improved with practice, and if so, whether this learning generalizes to untrained cases. To address these issues, normal-hearing adults (n = 9) were trained to detect sinusoidal amplitude modulation (SAM; 80-Hz rate, 3-4 kHz bandpass carrier) for 720 trials/day over 6-7 days and were tested before and after training on related SAM-detection and SAM-rate-discrimination conditions. Controls (n = 9) participated only in the pre- and post-tests. The trained listeners improved more than the controls on the trained condition between the pre- and post-tests, but different subgroups of trained listeners required different amounts of practice to reach asymptotic performance, ranging from 1 (n = 6) to 4-6 (n = 3) sessions. This training-induced learning did not generalize to detection with two untrained carrier spectra (5 kHz low-pass and 0.5-1.5 kHz bandpass) or to rate discrimination with the trained rate and carrier spectrum, but there was some indication that it generalized to detection with two untrained rates (30 and 150 Hz). Thus, practice improved the ability to detect amplitude modulation, but the generalization of this learning to untrained cases was somewhat limited.
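For readers unfamiliar with the trained stimulus, here is a minimal synthesis sketch. The sample rate, duration, and modulation depth are assumptions; only the 80-Hz rate and the 3-4 kHz carrier band come from the study.

```python
import numpy as np
from scipy.signal import butter, sosfilt

fs = 44100                            # sample rate in Hz (assumed)
t = np.arange(int(fs * 0.5)) / fs     # 500-ms stimulus (duration assumed)

# Band-pass noise carrier, 3-4 kHz, as in the trained condition
sos = butter(4, [3000, 4000], btype="bandpass", fs=fs, output="sos")
carrier = sosfilt(sos, np.random.default_rng(0).standard_normal(t.size))

# Sinusoidal amplitude modulation at the trained 80-Hz rate
m = 0.5                               # modulation depth (assumed)
sam = (1 + m * np.sin(2 * np.pi * 80 * t)) * carrier
```

Detection thresholds in such tasks are typically expressed as the smallest modulation depth m the listener can distinguish from the unmodulated carrier.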
Project description: The present study investigated the degree to which perceptual adaptation to foreign-accented speech is specific to the regularities in pronunciation associated with a particular accent. Across experiments, the conditions under which generalization of learning did or did not occur were evaluated. In Experiment 1, listeners were trained on word-length utterances in Korean-accented English and were tested with words produced by the same or a different set of Korean-accented speakers. Listeners performed better than untrained controls when tested with novel words from either the same or different speakers. In Experiment 2, listeners were trained with Spanish-, Korean-, or mixed-accented speech and transcribed novel words produced by unfamiliar Korean- or Spanish-accented speakers at test. The findings revealed relative specificity of learning: listeners trained and tested on the same variety of accented speech showed better transcription at test than those trained with a different accent or untrained controls, and performance after mixed-accent training was intermediate. Patterns of errors and analysis of the acoustic properties of accented vowels suggested perceptual improvement for the regularities arising from each accent, with learning dependent on the relative similarity of linguistic form within and across accents.
Project description: Generalizing from past experiences to novel situations is critical for adaptive behavior, whereas overgeneralization can promote maladaptive responses (e.g., context-inappropriate fear in anxiety). Here, we propose that overgeneralizing alcohol-related associations characterizes risky drinking. We conducted two online experiments assessing generalization of alcohol-related gains (Study 1) and losses (Study 2) among individuals who engaged in light or risky patterns of drinking (Study 1: N = 88, 24-44 years old; Study 2: N = 87, 21-44 years old). After learning to associate cards with alcohol and non-alcohol-related outcomes, participants chose whether to play with cards varying in perceptual similarity to those shown during conditioning. Finally, participants completed a surprise recognition memory test for all outcomes. Although both groups showed comparable conditioning, we found that risky drinkers overgeneralized alcohol-related gains and losses. Risky drinkers also showed a bias toward recognizing alcohol-related images. These results indicate a novel role for overgeneralization of alcohol-related gains and losses as a mechanism associated with risky drinking.
Project description: Multisensory stimuli speed behavioral responses, but the mechanisms underlying these effects remain disputed. Historically, the observation that multisensory reaction times (RTs) outpace models assuming independent sensory channels has been taken as evidence for multisensory integration (the "redundant target effect"; RTE). However, this interpretation has been challenged by alternative explanations based on stimulus sequence effects, RT variability, and/or negative correlations in unisensory processing. To clarify the mechanisms subserving the RTE, we collected RTs from 78 undergraduates in a multisensory simple-RT task. Based on previous neurophysiological findings, we hypothesized that the RTE was unlikely to reflect these alternative mechanisms and more likely reflected pre-potentiation of sensory responses through crossmodal phase-resetting. Contrary to accounts based on stimulus sequence effects, we found that preceding stimuli explained only 3-9% of the variance in apparent RTEs. Comparing three plausible evidence-accumulator models, we found that multisensory RT distributions were best explained by increased sensory evidence at stimulus onset. Because crossmodal phase-resetting increases cortical excitability before sensory input arrives, these results are consistent with a mechanism based on pre-potentiation through phase-resetting. Mathematically, this model entails increasing the prior log-odds of stimulus presence, providing a potential link between neurophysiological, behavioral, and computational accounts of multisensory interactions.
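A minimal sketch of the favored account, assuming a one-boundary evidence accumulator (all parameter values are illustrative, not fitted to the study's data): pre-potentiation is modeled as a raised level of evidence at stimulus onset rather than a faster rate of accumulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def first_passage_rts(n, drift, start, bound=1.0, noise=0.35,
                      dt=0.001, non_decision=0.2):
    """Simulate RTs as first-passage times of a one-boundary accumulator."""
    rts = np.empty(n)
    for i in range(n):
        x, t = start, 0.0
        while x < bound:
            x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
            t += dt
        rts[i] = t + non_decision
    return rts

unisensory = first_passage_rts(2000, drift=2.0, start=0.0)
# Crossmodal phase-resetting is assumed to pre-potentiate the response,
# i.e., to raise the starting level of evidence (higher prior log-odds),
# while leaving the accumulation rate unchanged.
multisensory = first_passage_rts(2000, drift=2.0, start=0.3)
print(unisensory.mean(), multisensory.mean())  # multisensory RTs come out faster
```

The design choice matters: a raised starting point shifts and compresses the whole RT distribution differently than an increased drift rate would, which is the kind of distributional signature model comparison can exploit.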
Project description: Naturally occurring signals in audition and touch can be complex and marked by temporal variations in frequency and amplitude. Auditory frequency-sweep processing has been studied extensively; however, much less is known about sweep processing in touch, because studies have primarily focused on the perception of simple sinusoidal vibrations. Given the extensive interactions between audition and touch in the frequency processing of pure-tone signals, we reasoned that these senses might also interact in the processing of higher-order frequency representations such as sweeps. In a series of psychophysical experiments, we characterized the influence of auditory distractors on the ability of participants to discriminate tactile frequency sweeps. Auditory frequency sweeps systematically biased the tactile perception of sweep direction. Importantly, auditory cues exerted little influence on tactile sweep-direction perception when the sounds and vibrations occupied different absolute frequency ranges or when the sounds consisted of intensity sweeps. Thus, audition and touch interact in frequency-sweep perception in a frequency- and feature-specific manner. Our results demonstrate that audio-tactile interactions are not constrained to the processing of simple sinusoids. Because higher-order frequency representations may be synthesized from simpler representations, our findings imply that multisensory interactions in the temporal frequency domain span multiple hierarchical levels of sensory processing.
Project description: The brain's ability to bind incoming auditory and visual stimuli depends critically on the temporal structure of this information. Specifically, there exists a temporal window of audiovisual integration within which stimuli are highly likely to be bound together and perceived as part of the same environmental event. Several studies have described the temporal bounds of this window, but few have investigated its malleability. Here, the plasticity in the size of this temporal window was investigated using a perceptual learning paradigm in which participants were given feedback during a two-alternative forced-choice (2-AFC) audiovisual simultaneity judgment task. Training resulted in a marked (approximately 40%) narrowing of the window. To rule out the possibility that this narrowing was the result of changes in cognitive biases, a second experiment used a two-interval forced-choice (2-IFC) paradigm in which participants identified which of two intervals contained a simultaneously presented audiovisual pair. The 2-IFC paradigm resulted in a narrowing similar in both degree and dynamics to that seen with the 2-AFC approach. Together, these results illustrate that different methods of multisensory perceptual training can produce substantial alterations in the circuits underlying the perception of audiovisual simultaneity. These findings suggest a high degree of flexibility in multisensory temporal processing and have important implications for interventional strategies that may be used to ameliorate clinical conditions (e.g., autism, dyslexia) in which multisensory temporal function may be impaired.
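To make the "size of the window" concrete, one common way to quantify it is to fit a Gaussian to the proportion of "simultaneous" reports as a function of stimulus onset asynchrony (SOA) and take the fitted width. The sketch below uses synthetic response proportions; the study's exact fitting procedure may differ.

```python
import numpy as np
from scipy.optimize import curve_fit

def sj_curve(soa, amp, mu, sigma):
    """Proportion of 'simultaneous' reports as a function of SOA (ms)."""
    return amp * np.exp(-(soa - mu)**2 / (2 * sigma**2))

soas = np.array([-300, -200, -100, -50, 0, 50, 100, 200, 300], float)
# Synthetic pre- and post-training data (illustrative values only)
pre  = np.array([0.15, 0.35, 0.75, 0.90, 0.95, 0.92, 0.80, 0.40, 0.20])
post = np.array([0.05, 0.15, 0.55, 0.85, 0.95, 0.88, 0.60, 0.18, 0.08])

for label, y in [("pre", pre), ("post", post)]:
    (amp, mu, sigma), _ = curve_fit(sj_curve, soas, y, p0=[1.0, 0.0, 100.0])
    # Full width at half maximum of a Gaussian is ~2.355 * sigma
    print(label, "window width (FWHM, ms):", round(2.355 * sigma, 1))
```

On a measure like this, the reported ~40% narrowing corresponds to a post-training width roughly 60% of the pre-training width.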
Project description: Sensory information about the state of the world is generally ambiguous. Understanding how the nervous system resolves such ambiguities to infer the actual state of the world is a central quest for sensory neuroscience. However, the computational principles of perceptual disambiguation are still poorly understood: what drives perceptual decision-making between multiple equally valid solutions? Here we investigate how humans gather and combine sensory information, within and across modalities, to disambiguate motion perception in an ambiguous audiovisual display in which two moving stimuli can appear either to stream through or to bounce off each other. By combining psychophysical classification tasks with reverse correlation analyses, we identified the particular spatiotemporal stimulus patterns that elicit a stream or a bounce percept, respectively. From these, we developed and tested a computational model of uni- and multisensory perceptual disambiguation that closely replicates human performance. Specifically, disambiguation relies on knowledge of prototypical bouncing events that contain characteristic patterns of motion energy in the dynamic visual display. This visual information is then linearly integrated with auditory cues and with prior knowledge about the history of recent perceptual interpretations. Moreover, we demonstrate that perceptual decision-making with ambiguous displays is systematically driven by noise, whose random patterns not only promote alternation but also provide signal-like information that biases perception in a highly predictable fashion.
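The logic of the reverse correlation analysis can be sketched in a few lines. This is a toy simulation with a hypothetical template observer, not the study's actual stimuli or analysis pipeline: the classification image is the difference between the average noise patterns that preceded each percept.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_features = 5000, 64      # e.g., motion-energy samples per trial
noise = rng.standard_normal((n_trials, n_features))

# Hypothetical observer: reports "bounce" when the trial's noise projects
# strongly enough onto an internal bounce prototype (plus decision noise).
template = np.sin(np.linspace(0, np.pi, n_features))
reports_bounce = noise @ template + rng.standard_normal(n_trials) > 0

# Classification image: mean noise on "bounce" trials minus mean noise on
# "stream" trials; it recovers the shape of the internal template.
classification_image = (noise[reports_bounce].mean(axis=0)
                        - noise[~reports_bounce].mean(axis=0))
```

Applied to real trials, the same subtraction reveals which spatiotemporal stimulus patterns push perception toward one interpretation or the other, which is how signal-like structure in the noise can bias percepts predictably.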