Project description: Noise is a ubiquitous source of errors in all forms of communication [1]. Noise-induced errors in speech communication, for example, make it difficult for humans to converse in noisy social settings, a challenge aptly named the "cocktail party problem" [2]. Many nonhuman animals also communicate acoustically in noisy social groups and thus face biologically analogous problems [3]. However, we know little about how the perceptual systems of receivers are evolutionarily adapted to avoid the costs of noise-induced errors in communication. In this study of Cope's gray treefrog (Hyla chrysoscelis; Hylidae), we investigated whether receivers exploit a potential statistical regularity present in noisy acoustic scenes to reduce errors in signal recognition and discrimination. We developed an anatomical/physiological model of the peripheral auditory system to show that temporal correlation in amplitude fluctuations across the frequency spectrum ("comodulation") [4-6] is a feature of the noise generated by large breeding choruses of sexually advertising males. In four psychophysical experiments, we investigated whether females exploit comodulation in background noise to mitigate noise-induced errors in evolutionarily critical mate-choice decisions. Subjects experienced fewer errors in recognizing conspecific calls and in selecting the calls of high-quality mates in the presence of simulated chorus noise that was comodulated. These data show unequivocally, and for the first time, that exploiting statistical regularities present in noisy acoustic scenes is an important biological strategy for solving cocktail-party-like problems in nonhuman animal communication.
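Comodulation, as used above, means that the amplitude envelopes of different frequency bands fluctuate together over time. Below is a minimal sketch of how one might measure it, assuming Python with NumPy/SciPy; the band edges and the toy comodulated noise are illustrative choices, not the study's actual stimuli or analysis:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def band_envelope(x, fs, lo, hi, order=4):
    """Amplitude envelope of x within one frequency band (Hilbert magnitude)."""
    sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return np.abs(hilbert(sosfiltfilt(sos, x)))

def comodulation(x, fs, bands):
    """Pairwise correlations of band envelopes; values near 1 indicate
    comodulated noise, values near 0 indicate independent fluctuations."""
    envs = [band_envelope(x, fs, lo, hi) for lo, hi in bands]
    return np.corrcoef(np.array(envs))

# Toy example: broadband noise scaled by one shared slow envelope is comodulated.
fs = 44100
t = np.arange(2 * fs) / fs
shared = 1 + 0.8 * np.sin(2 * np.pi * 8 * t)   # shared 8 Hz amplitude fluctuation
noise = np.random.default_rng(0).standard_normal(t.size) * shared
print(comodulation(noise, fs, [(500, 1000), (1500, 2000), (2500, 3000)]))
```

With the shared envelope, the off-diagonal correlations come out high; replacing `shared` with an independent envelope per band would drive them toward zero.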
Project description: Categorization performance is a popular metric of scene recognition and understanding in behavioral and computational research. However, categorical constructs and their labels can be somewhat arbitrary. Derived from exhaustive vocabularies of place names (e.g., Deng et al., 2009), or the judgements of small groups of researchers (e.g., Fei-Fei, Iyer, Koch, & Perona, 2007), these categories may not correspond with human-preferred taxonomies. Here, we propose clustering by increasing the Rand index via coordinate ascent (CIRCA): an unsupervised, data-driven clustering method for deriving ground-truth scene categories. In Experiment 1, human participants organized 80 stereoscopic images of outdoor scenes from the Southampton-York Natural Scenes (SYNS) dataset (Adams et al., 2016) into discrete categories. In separate tasks, images were grouped according to i) semantic content, ii) three-dimensional spatial structure, or iii) two-dimensional image appearance. Participants provided text labels for each group. Using the CIRCA method, we determined the most representative category structure and then derived category labels for each task/dimension. In Experiment 2, we found that these categories generalized well to a larger set of SYNS images, and new observers. In Experiment 3, we tested the relationship between our category systems and the spatial envelope model (Oliva & Torralba, 2001). Finally, in Experiment 4, we validated CIRCA on a larger, independent dataset of same-different category judgements. The derived category systems outperformed the SUN taxonomy (Xiao, Hays, Ehinger, Oliva, & Torralba, 2010) and an alternative clustering method (Greene, 2019). In summary, we believe this novel categorization method can be applied to a wide range of datasets to derive optimal categorical groupings and labels from psychophysical judgements of stimulus similarity.
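The abstract does not spell out the algorithm, but the name suggests a consensus partition refined one item at a time to maximize agreement (Rand index) with the participants' groupings. Here is a minimal coordinate-ascent sketch in that spirit, assuming Python/NumPy; the fixed cluster count k, the random initialization, and the function names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def rand_index(a, b):
    """Fraction of item pairs on which two partitions agree (together/apart)."""
    a, b = np.asarray(a), np.asarray(b)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    iu = np.triu_indices(len(a), k=1)          # each unordered pair once
    return (same_a == same_b)[iu].mean()

def circa(participant_labels, k, n_sweeps=50, seed=0):
    """Coordinate ascent on the mean Rand index across participants."""
    rng = np.random.default_rng(seed)
    P = [np.asarray(p) for p in participant_labels]
    n_items = len(P[0])
    labels = rng.integers(0, k, n_items)       # random initial partition
    score = np.mean([rand_index(labels, p) for p in P])
    for _ in range(n_sweeps):
        improved = False
        for i in range(n_items):               # one coordinate = one item's label
            best_c = labels[i]
            for c in range(k):
                if c == best_c:
                    continue
                labels[i] = c
                s = np.mean([rand_index(labels, p) for p in P])
                if s > score:
                    score, best_c, improved = s, c, True
            labels[i] = best_c                 # keep the best label found
        if not improved:                       # local optimum reached
            break
    return labels, score

# Toy demo: three participants grouping six images
groupings = [[0, 0, 1, 1, 2, 2],
             [0, 0, 1, 1, 1, 2],
             [0, 0, 0, 1, 2, 2]]
labels, score = circa(groupings, k=3)
print(labels, round(float(score), 3))
```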
Project description: Research shows that emotional stimuli can capture attention, and this can benefit or impair performance, depending on the characteristics of the task. Additionally, whilst some findings show that attention expands under positive conditions, others show that emotion has no influence on the broadening of attention. The current study investigated whether emotional real-world scenes influence attention in a visual search task. Participants were asked to identify a target letter embedded in the centre or periphery of emotional images. Identification accuracy was lower for positive images than for neutral images, and response times were slower for negative images. This suggests that real-world emotional stimuli have a distracting effect on visual attention and search. There was no evidence that emotional images influenced the spatial spread of attention. Instead, the findings may support the argument that positive emotion encourages a global processing style whereas negative emotion promotes local processing.
Project description: Perceptual load is a well-established determinant of attentional engagement in a task. So far, perceptual load has typically been manipulated by increasing either the number of task-relevant items or the perceptual processing demand (e.g. conjunction versus feature tasks). The tasks used have often involved rather simple visual displays (e.g. letters or single objects). How can perceptual load be operationalized for richer, real-world images? A promising proxy is the visual complexity of an image. However, current predictive models of visual complexity have limited applicability to diverse real-world images. Here we modelled visual complexity using a deep convolutional neural network (CNN) trained to predict perceived ratings of visual complexity. We presented 53 observers with 4000 images from the PASCAL VOC dataset, obtaining 75,020 two-alternative forced-choice paired comparisons across observers. Image visual complexity scores were obtained using the TrueSkill algorithm. A CNN with weights pre-trained on an object recognition task predicted complexity ratings with r = 0.83. By contrast, feature-based models used in the literature, operating on image statistics such as entropy, edge density and JPEG compression ratio, only achieved r = 0.70. Thus, our model offers a promising method to quantify the perceptual load of real-world scenes through visual complexity.
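A step readers may want to see concretely is how pairwise 2AFC choices become per-image scalar scores. Below is a minimal sketch using the open-source trueskill Python package; treating each comparison as a two-player match, disallowing draws, and summarizing each rating as mu - 3*sigma are illustrative assumptions, since the abstract does not give the exact settings:

```python
# pip install trueskill
import trueskill

def complexity_scores(image_ids, comparisons):
    """Turn 2AFC paired comparisons into scalar complexity scores.

    comparisons: iterable of (winner, loser) image-id pairs, where the
    winner is the image judged more visually complex on that trial.
    """
    env = trueskill.TrueSkill(draw_probability=0.0)   # 2AFC: no ties
    ratings = {i: env.create_rating() for i in image_ids}
    for winner, loser in comparisons:
        ratings[winner], ratings[loser] = env.rate_1vs1(ratings[winner],
                                                        ratings[loser])
    # Conservative point estimate: mean skill minus three standard deviations
    return {i: r.mu - 3 * r.sigma for i, r in ratings.items()}

print(complexity_scores(["img1", "img2", "img3"],
                        [("img1", "img2"), ("img1", "img3"), ("img2", "img3")]))
```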
Project description: The complexity of natural scenes makes it challenging to experimentally study the mechanisms behind human gaze behavior when viewing dynamic environments. Historically, eye movements were believed to be driven primarily by space-based attention towards locations with salient features. Increasing evidence suggests, however, that visual attention does not select locations with high saliency but operates on attentional units given by the objects in the scene. We present a new computational framework to investigate the importance of objects for attentional guidance. This framework is designed to simulate realistic scanpaths for dynamic real-world scenes, including saccade timing and smooth pursuit behavior. Individual model components are based on psychophysically uncovered mechanisms of visual attention and saccadic decision-making. All mechanisms are implemented in a modular fashion with a small number of well-interpretable parameters. To systematically analyze the importance of objects in guiding gaze behavior, we implemented five models within this framework: two purely spatial models (one based on low-level and one on high-level saliency), two object-based models (one incorporating low-level saliency for each object, the other using no saliency information), and a mixed model with object-based attention and selection but space-based inhibition of return. We optimized each model's parameters to reproduce the saccade amplitude and fixation duration distributions of human scanpaths using evolutionary algorithms. We compared model performance with respect to spatial and temporal fixation behavior, including the proportion of fixations exploring the background, as well as detecting, inspecting, and returning to objects. A model with object-based attention and inhibition, which uses saliency information to prioritize between objects for saccadic selection, leads to scanpath statistics with the highest similarity to the human data. This demonstrates that scanpath models benefit from object-based attention and selection, suggesting that object-level attentional units play an important role in guiding attentional processing.
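To illustrate the parameter-fitting step, here is a minimal evolutionary sketch that tunes a toy model to reproduce a saccade-amplitude distribution, assuming Python with NumPy/SciPy. The gamma-distributed stand-in model, the KS-statistic fitness, and the simple (mu + lambda) strategy are assumptions for illustration, not the authors' actual framework or objective:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

def simulate_amplitudes(params, n=2000):
    """Toy stand-in for a scanpath model: saccade amplitudes drawn from a
    gamma distribution whose (log) shape and scale are the free parameters."""
    shape, scale = np.exp(params)              # log-parameterization keeps both positive
    return rng.gamma(shape, scale, n)

def fitness(params, human):
    """Negative KS distance between simulated and human amplitude distributions."""
    return -ks_2samp(simulate_amplitudes(params), human).statistic

def evolve(human, pop_size=30, n_gen=40, sigma=0.3):
    """Minimal (mu + lambda) evolution strategy over the model parameters."""
    pop = rng.normal(0.0, 1.0, (pop_size, 2))
    for _ in range(n_gen):
        scores = np.array([fitness(p, human) for p in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]   # keep the best half
        children = parents + rng.normal(0.0, sigma, parents.shape)
        pop = np.vstack([parents, children])
    scores = np.array([fitness(p, human) for p in pop])
    return pop[scores.argmax()]

# Toy "human" data: the fit should recover shape ~2 and scale ~3
human = rng.gamma(2.0, 3.0, 2000)
print(np.exp(evolve(human)))
```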
Project description: Advances in digital technology mean that the creation of visually compelling photographic fakes is growing at an incredible speed. The prevalence of manipulated photos in our everyday lives invites an important, yet largely unanswered, question: Can people detect photo forgeries? Previous research using simple computer-generated stimuli suggests people are poor at detecting geometrical inconsistencies within a scene. We do not know, however, whether such limitations also apply to real-world scenes that contain common properties that the human visual system is attuned to processing. In two experiments we asked people to detect and locate manipulations within images of real-world scenes. Subjects demonstrated only a limited ability to discriminate original from manipulated images. Furthermore, across both experiments, even when subjects correctly detected manipulated images, they were often unable to locate the manipulation. People's ability to detect manipulated images was positively correlated with the extent of disruption to the underlying structure of the pixels in the photo. We also explored whether manipulation type and individual differences were associated with people's ability to identify manipulations. Taken together, our findings show, for the first time, that people have a poor ability to identify whether a real-world image is original or has been manipulated. The results have implications for professionals working with digital images in legal, media, and other domains.
Project description: Across the adult lifespan, memory processes are subject to pronounced changes. Prior knowledge and expectations might critically shape these functional differences; however, corresponding findings have so far remained ambiguous. Here, we chose a tailored approach to scrutinize how schema (in-)congruencies affect older and younger adults' memory for objects embedded in real-world scenes, a scenario close to the memory demands of everyday life. A sample of 23 older (52-81 years) and 23 younger adults (18-38 years) freely viewed 60 photographs of scenes in which target objects were included that were either congruent or incongruent with the given context information. After a delay, recognition performance for those objects was determined. In addition, recognized objects had to be matched to the scene context in which they had previously been presented. While we found schema violations beneficial for object recognition across age groups, the advantage was significantly less pronounced in older adults. We moreover observed an age-related congruency bias for matching objects to their original scene context. Our findings support a critical role of predictive processes in age-related memory differences and indicate enhanced weighting of predictions with age. We suggest that recent predictive processing theories provide a particularly useful framework for elaborating on age-related functional vulnerabilities as well as stability.
Project description: Our knowledge about the world is represented not merely as a collection of concepts, but as an organized lexico-semantic network in which concepts can be linked by relations, such as "taxonomic" relations between members of the same stable category (e.g., cat and sheep), or association between entities that occur together or in the same context (e.g., sock and foot). To date, accounts of the origins of semantic organization have largely overlooked how sensitivity to statistical regularities ubiquitous in the environment may play a powerful role in shaping semantic development. The goal of the present research was to investigate how associations in the form of statistical regularities with which labels for concepts co-occur in language (e.g., sock and foot) and taxonomic relatedness (e.g., sock and pajamas) shape the semantic organization of 4- to 5-year-olds and adults. To examine these aspects of semantic organization across development, we conducted three experiments examining effects of co-occurrence and taxonomic relatedness on cued recall (Experiment 1), word-picture matching (Experiment 2), and looking dynamics in a Visual World paradigm (Experiment 3). Taken together, the results of the three experiments provide evidence that co-occurrence-based links between concepts manifest in semantic organization from early childhood onward and are increasingly supplemented by taxonomic links. We discuss these findings in relation to theories of semantic development.
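To make the co-occurrence construct concrete, here is a minimal sketch of one standard way to quantify how strongly two labels co-occur in language: pointwise mutual information over a sliding window. The window size and toy sentences are illustrative assumptions; the abstract does not state which corpus measure the authors used:

```python
import math
from collections import Counter
from itertools import combinations

def cooccurrence_pmi(sentences, window=5):
    """Pointwise mutual information for word pairs appearing within a sliding
    window, e.g. to quantify how strongly 'sock' predicts 'foot'."""
    word_counts, pair_counts, n_windows = Counter(), Counter(), 0
    for tokens in sentences:
        for i in range(len(tokens)):
            win = set(tokens[i:i + window])
            n_windows += 1
            word_counts.update(win)            # windows containing each word
            pair_counts.update(tuple(sorted(p)) for p in combinations(win, 2))
    pmi = {}
    for (a, b), c in pair_counts.items():
        pmi[(a, b)] = math.log(c * n_windows / (word_counts[a] * word_counts[b]))
    return pmi

sents = [["the", "sock", "on", "his", "foot"],
         ["one", "sock", "and", "one", "foot"],
         ["the", "cat", "and", "the", "sheep"]]
print(cooccurrence_pmi(sents)[("foot", "sock")])   # positive PMI: they co-occur
```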
Project description: Most objects show high degrees of spatial regularity (e.g. beach umbrellas appear above, not under, beach chairs). The spatial regularities of real-world objects benefit visual working memory (VWM), but the mechanisms behind this spatial regularity effect remain unclear. The "encoding specificity" hypothesis suggests that spatial regularity enhances the visual encoding process but does not facilitate the online integration of information during VWM maintenance. The "perception-alike" hypothesis suggests that spatial regularity functions in both visual encoding and online integration during VWM maintenance. We investigated whether VWM integrates sequentially presented real-world objects by focusing on the existence of the spatial regularity effect. Across five experiments, we manipulated the presentation (simultaneous vs. sequential) and regularity (with vs. without regularity) of memory arrays among pairs of real-world objects. The spatial regularity of memory objects presented simultaneously, but not sequentially, improved VWM performance. We also examined whether memory load, verbal suppression and masking, and memory array duration hindered the spatial regularity effect in sequential presentation. We found a stable absence of the spatial regularity effect, suggesting that participants were unable to integrate real-world objects based on spatial regularities online. Our results support the encoding specificity hypothesis, wherein the spatial regularity of real-world objects can enhance the efficiency of VWM encoding, but VWM cannot exploit spatial regularity to help organize sequentially sampled information into meaningful integrated representations.
Project description: In virtually every real-life situation humans are confronted with complex and cluttered visual environments that contain a multitude of objects. Because of the limited capacity of the visual system, objects compete for neural representation and cognitive processing resources. Previous work has shown that such attentional competition is partly object based, such that competition among elements is reduced when these elements perceptually group into an object based on low-level cues. Here, using functional MRI (fMRI) and behavioral measures, we show that the attentional benefit of grouping extends to higher-level grouping based on the relative position of objects as experienced in the real world. An fMRI study designed to measure competitive interactions among objects in human visual cortex revealed reduced neural competition between objects when these were presented in commonly experienced configurations, such as a lamp above a table, relative to the same objects presented in other configurations. In behavioral visual search studies, we then related this reduced neural competition to improved target detection when distracter objects were shown in regular configurations. Control studies showed that low-level grouping could not account for these results. We interpret these findings as reflecting the grouping of objects based on higher-level spatial-relational knowledge acquired through a lifetime of seeing objects in specific configurations. This interobject grouping effectively reduces the number of objects that compete for representation and thereby contributes to the efficiency of real-world perception.