Bayesian motion estimation accounts for a surprising bias in 3D vision.
ABSTRACT: Determining the approach of a moving object is a vital survival skill that depends on the brain combining information about lateral translation and motion-in-depth. Given the importance of sensing motion for obstacle avoidance, it is surprising that humans make errors, reporting an object will miss them when it is on a collision course with their head. Here we provide evidence that biases observed when participants estimate movement in depth result from the brain's use of a "prior" favoring slow velocity. We formulate a Bayesian model for computing 3D motion using independently estimated parameters for the shape of the visual system's slow velocity prior. We demonstrate the success of this model in accounting for human behavior in separate experiments that assess both sensitivity and bias in 3D motion estimation. Our results show that a surprising perceptual error in 3D motion perception reflects the importance of prior probabilities when estimating environmental properties.
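A minimal sketch of the bias such a slow-velocity prior induces, assuming Gaussian likelihoods and a zero-mean Gaussian prior (the study estimated the actual prior's shape independently; the function name, noise levels, and speeds below are purely illustrative):

```python
import numpy as np

def map_velocity(v_measured, sigma_like, sigma_prior):
    """MAP velocity estimate under a zero-mean Gaussian slow-motion prior.

    With likelihood N(v_measured, sigma_like^2) and prior N(0, sigma_prior^2),
    the posterior mode shrinks the measurement toward zero speed."""
    w = sigma_prior**2 / (sigma_prior**2 + np.asarray(sigma_like)**2)
    return w * np.asarray(v_measured)

# Motion in depth is typically measured less reliably than lateral motion,
# so the prior shrinks the depth component more, rotating the perceived
# trajectory away from a head-on collision course.
v_true = np.array([0.1, 1.0])      # [lateral, in-depth] components, m/s
sigmas = np.array([0.1, 0.5])      # illustrative measurement noise
print(map_velocity(v_true, sigmas, sigma_prior=1.0))
# -> lateral component barely shrunk, depth component shrunk substantially
```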
Project description: Many organisms and objects deform nonrigidly when moving, requiring perceivers to separate shape changes from object motions. Surprisingly, the ability of observers to correctly infer nonrigid volumetric shapes from motion cues has not been measured, and structure-from-motion models predominantly use variants of rigidity assumptions. We show that observers are equally sensitive at discriminating the cross-sections of flexing and rigid cylinders from motion cues when the cylinders are rotated simultaneously around the vertical and depth axes. A computational model based on motion perspective (i.e., assuming perceived depth is inversely proportional to local velocity) predicted the psychometric curves better than shape-from-motion factorization models using shape or trajectory basis functions. Asymmetric percepts of symmetric cylinders, arising from asymmetric velocity profiles, provided additional evidence for the dominant role of relative velocity in shape perception. Finally, we show that inexperienced observers are generally incapable of using motion cues to detect the inflation or deflation of rigid and flexing cylinders, but this handicap can be overcome with practice for both nonrigid and rigid shapes. The empirical and computational results of this study argue against the use of rigidity assumptions in extracting 3D shape from motion and for the primacy of motion deformations computed from motion shears.
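A minimal sketch of the motion-perspective rule on which the model is based, taking at face value the description's statement that perceived depth is inversely proportional to local image velocity (the scaling constant k and the sample speeds are illustrative):

```python
import numpy as np

def depth_from_motion_perspective(local_speed, k=1.0, eps=1e-6):
    """Assign perceived depth inversely proportional to local image speed,
    per the motion-perspective assumption stated above. k sets the
    arbitrary depth scale; eps guards points momentarily at rest."""
    return k / (np.abs(local_speed) + eps)

# An asymmetric velocity profile across a cylinder's surface therefore
# maps to an asymmetric perceived cross-section, even for a symmetric shape.
speeds = np.array([0.2, 0.5, 1.0, 0.6, 0.3])
print(depth_from_motion_perspective(speeds))
```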
Project description: An object moving through three-dimensional (3D) space typically yields different patterns of velocities in each eye. For an interocular velocity difference cue to be used, some instances of real 3D motion in the environment (e.g., when a moving object is partially occluded) would require an interocular velocity difference computation that operates on motion signals that are not only monocular (or eye specific) but also depend on each eye's two-dimensional (2D) direction being estimated over regions larger than the size of V1 receptive fields (i.e., global pattern motion). We investigated this possibility using 3D motion aftereffects (MAEs) with stimuli comprising many small, drifting Gabor elements. Conventional frontoparallel (2D) MAEs were local: highly sensitive to the test elements being in the same locations as the adaptor (Experiment 1). In contrast, 3D MAEs were robust to the test elements being in different retinal locations than the adaptor, indicating that 3D motion processing involves relatively global spatial pooling of motion signals (Experiment 2). The 3D MAEs were strong even when the local elements were in unmatched locations across the two eyes during adaptation, as well as when the adapting stimulus elements were randomly oriented and specified global motion via the intersection of constraints (Experiment 3). These results bolster the notion of eye-specific computation of 2D pattern motion (involving global pooling of local, eye-specific motion signals) for the purpose of computing 3D motion, and highlight the idea that classically "late" computations such as pattern motion can be done in a manner that retains information about the eye of origin.
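The interocular velocity difference cue itself can be sketched with textbook small-angle geometry: the average of the two eyes' horizontal retinal velocities carries lateral motion, while their difference is the rate of change of disparity, which scales to motion in depth. This is not the paper's model; the interocular distance, sign conventions, and function name are assumptions:

```python
import numpy as np

def iovd_3d_motion(v_left, v_right, view_dist, iod=0.064):
    """Recover lateral motion and motion in depth from the two eyes'
    horizontal retinal velocities (rad/s) via the interocular velocity
    difference, using small-angle viewing geometry.

    view_dist: viewing distance D in meters.
    iod: interocular distance I in meters (assumed value)."""
    v_lateral = view_dist * (v_left + v_right) / 2.0   # m/s
    d_disparity_dt = v_left - v_right                  # rad/s
    v_depth = (view_dist**2 / iod) * d_disparity_dt    # m/s
    return v_lateral, v_depth

# Equal and opposite velocities in the two eyes -> pure motion in depth.
print(iovd_3d_motion(v_left=0.01, v_right=-0.01, view_dist=0.57))
```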
Project description: The retinal image changes that occur during locomotion, the optic flow, carry information about self-motion and the three-dimensional structure of the environment. Fast-moving animals with little binocular vision depend especially on these depth cues for maneuvering, and they actively control their gaze to facilitate the perception of depth from cues in the optic flow. In the visual system of birds, neurons in nucleus rotundus were originally found to respond to object motion but not to background motion. However, when background and object were both moving, responses increased the more the direction and velocity of object and background motion on the retina differed. These properties may play a role in representing depth cues in the optic flow. We therefore investigated how neurons in nucleus rotundus respond to optic flow that contains depth cues. We presented simplified and naturalistic optic flow on a panoramic LED display while recording from single neurons in nucleus rotundus of anaesthetized zebra finches. Unlike most studies on motion vision in birds, our stimuli included depth information. We found extensive responses of motion-selective neurons in nucleus rotundus to optic flow stimuli. Simplified stimuli revealed preferences for optic flow reflecting translational or rotational self-motion. Naturalistic optic flow stimuli elicited complex response modulations, but the presence of objects was signaled by only a few neurons. The neurons that did respond to objects in the optic flow, however, showed interesting properties.
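The distinction between translational and rotational self-motion flow can be made concrete with the standard pinhole flow equations (in the style of Longuet-Higgins and Prazdny), in which only the translational component is scaled by inverse depth and therefore carries depth information. This generic sketch is not the stimulus-generation code used in the study:

```python
import numpy as np

def self_motion_flow(x, y, Z, T, Omega, f=1.0):
    """Image velocity (u, v) at point (x, y) with depth Z for observer
    translation T = (Tx, Ty, Tz) and rotation Omega = (wx, wy, wz),
    pinhole camera with focal length f.

    Only the translational terms are scaled by 1/Z, so only they carry
    depth information; rotational flow is depth-independent."""
    Tx, Ty, Tz = T
    wx, wy, wz = Omega
    u = (-f * Tx + x * Tz) / Z + (x * y / f) * wx - (f + x**2 / f) * wy + y * wz
    v = (-f * Ty + y * Tz) / Z + (f + y**2 / f) * wx - (x * y / f) * wy - x * wz
    return u, v
```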
Project description: The brain infers our spatial orientation and properties of the world from ambiguous and noisy sensory cues. Judging self-motion (heading) in the presence of independently moving objects poses a challenging inference problem because the image motion of an object could be attributed to movement of the object, self-motion, or some combination of the two. We test whether perception of heading and object motion follows predictions of a normative causal inference framework. In a dual-report task, subjects indicated whether an object appeared stationary or moving in the virtual world, while simultaneously judging their heading. Consistent with causal inference predictions, the proportion of object stationarity reports, as well as the accuracy and precision of heading judgments, depended on the speed of object motion. Critically, biases in perceived heading declined when the object was perceived to be moving in the world. Our findings suggest that the brain interprets object motion and self-motion using a causal inference framework.
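A minimal sketch of the causal inference computation being tested, assuming Gaussian noise and an illustrative prior on stationarity (the parameter values and function name are not from the study):

```python
import numpy as np
from scipy.stats import norm

def p_stationary(obj_retinal_vel, self_flow_vel, sigma=1.0,
                 sigma_moving=5.0, prior_stationary=0.7):
    """Posterior probability that an object is stationary in the world.

    Stationary cause: the object's retinal velocity should match the flow
    predicted by self-motion alone (narrow likelihood). Moving cause: its
    retinal velocity also includes world motion drawn from a broad
    zero-mean distribution. All parameter values are illustrative."""
    like_stat = norm.pdf(obj_retinal_vel, self_flow_vel, sigma)
    like_move = norm.pdf(obj_retinal_vel, self_flow_vel,
                         np.hypot(sigma, sigma_moving))
    post_stat = like_stat * prior_stationary
    return post_stat / (post_stat + like_move * (1.0 - prior_stationary))

# Larger deviation from the predicted flow -> fewer "stationary" reports,
# consistent with the reported dependence on object speed.
for dv in (0.0, 2.0, 8.0):
    print(dv, p_stationary(obj_retinal_vel=dv, self_flow_vel=0.0))
```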
Project description: This article presents a dataset of 4000 synthetic images portraying five 3D models from different viewpoints under varying lighting conditions. Depth-of-field and motion-blur effects were also applied to make the images more realistic. For each object, eight scenes with different combinations of lighting, depth of field, and motion blur were created, and images were taken from 100 points of view. The data also include the camera's intrinsic and extrinsic calibration parameters for each image, as well as the ground-truth geometry of the 3D models. The images were rendered using Blender. The aim of this dataset is to allow evaluation and comparison of different solutions for 3D reconstruction of objects starting from a set of images taken under different realistic acquisition setups.
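The intrinsic and extrinsic parameters shipped with each image feed the standard pinhole projection model, sketched below. The dataset's exact file format and parameter names are not specified here, so this is a generic illustration of how such parameters relate the ground-truth geometry to the rendered images:

```python
import numpy as np

def project_points(points_world, K, R, t):
    """Project ground-truth 3D points (N, 3) into an image using the
    intrinsic matrix K (3, 3) and extrinsic pose R (3, 3), t (3,).

    Standard pinhole model x ~ K (R X + t); lens distortion omitted."""
    pts_cam = points_world @ R.T + t     # world -> camera frame
    pix = pts_cam @ K.T                  # apply intrinsics
    return pix[:, :2] / pix[:, 2:3]      # perspective divide -> (u, v)
```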
Project description: The perception of depth from relative motion is believed to be a slow process that "builds up" over a period of observation. However, in the case of motion parallax, the potential accuracy of the depth estimate suffers as the observer translates during the viewing period. Our recent quantitative model for the perception of depth from motion parallax proposes that relative object depth (d) can be determined from retinal image motion (dθ/dt), pursuit eye movement (dα/dt), and fixation distance (f) by the formula: d/f ≈ dθ/dα. Given the model's dynamics, it is important to know the integration time required by the visual system to recover dθ and dα, and then estimate d. Knowing the minimum integration time reveals the incumbent error in this process. A depth-phase discrimination task was used to determine the time necessary to perceive depth sign from motion parallax. Observers remained stationary and viewed a briefly translating random-dot motion parallax stimulus. Stimulus duration varied between trials. Fixation on the translating stimulus was monitored and enforced with an eye-tracker. The study found that relative depth discrimination can be performed with presentations as brief as 16.6 ms, with only two stimulus frames providing both retinal image motion and the stimulus window motion for pursuit (mean range = 16.6-33.2 ms). This was found both for conditions in which, prior to stimulus presentation, the eye was engaged in ongoing pursuit and for conditions in which the eye was stationary. A large high-contrast masking stimulus disrupted depth discrimination for stimulus presentations shorter than 70-75 ms in both pursuit and stationary conditions. This interval might be linked to the latencies of ocular-following-response eye movements. We conclude that the neural mechanisms serving depth from motion parallax generate a depth estimate much more quickly than previously believed. We propose that additional sluggishness might be due to the visual system's attempt to determine the maximum dθ/dα ratio for a selection of points on a complicated stimulus.
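The motion/pursuit law above translates directly into code. This is a plain transcription of the stated formula d/f ≈ dθ/dα, with the sign of the result giving depth sign (near versus far):

```python
def depth_from_motion_pursuit(dtheta, dalpha, fixation_dist):
    """Motion/pursuit law: d / f ~= dtheta / dalpha.

    dtheta: retinal image motion of the point (deg/s),
    dalpha: pursuit eye-movement rate (deg/s),
    fixation_dist: fixation distance f.
    Returns relative depth d; its sign gives depth sign (near vs. far)."""
    return fixation_dist * dtheta / dalpha
```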
Project description: Any processing delay between the two eyes can result in illusory 3D percepts for moving objects, either through changes in pure disparities over time for disparity sensors or through changes to sensors that encode motion and disparity conjointly. This is demonstrated by viewing a fronto-parallel pendulum through a neutral density (ND) filter placed over one eye, which yields the illusory 3D percept of the pendulum following an elliptical orbit in depth, the so-called Pulfrich phenomenon. Here we use a paradigm in which a cylinder rotating in depth, defined by moving Gabor patches, is presented at different interocular phases, generating strong to ambiguous depth percepts. This paradigm allows one to manipulate the contrast and the luminance of the patches independently to determine their influence on perceived motion in depth. We show psychophysically that an interocular contrast difference can itself produce a similar illusory 3D percept of motion in depth. We argue that contrast, like luminance (the ND filter), can modify the dynamics of visual neurons, resulting in an interocular processing delay or an interocular velocity difference.
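How a fixed interocular delay turns lateral motion into an elliptical orbit in depth can be sketched directly: the delayed eye sees the target where it was `delay` seconds ago, so the effective disparity is proportional to instantaneous velocity and hence 90 degrees out of phase with position (the amplitude, frequency, and delay below are illustrative):

```python
import numpy as np

def pulfrich_disparity(t, amp=0.1, omega=2.0, delay=0.010):
    """Effective disparity created by delaying one eye's view of a
    pendulum x(t) = amp * sin(omega * t) by `delay` seconds.

    x(t) - x(t - delay) ~ velocity * delay, which is 90 degrees out of
    phase with position, so position plus illusory disparity traces an
    elliptical orbit in depth."""
    return amp * np.sin(omega * t) - amp * np.sin(omega * (t - delay))

t = np.linspace(0.0, 2.0 * np.pi, 9)
print(pulfrich_disparity(t))  # peaks where the pendulum moves fastest
```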
Project description: Some animals with lateral eyes (such as bees) control their navigation through the 3D world using velocity differences between the two eyes. Other animals with frontal eyes (such as primates, including humans) can perceive 3D motion based on the different velocities that a moving object projects upon the two retinae. Although one type of 3D motion perception involves a comparison between velocities from vastly different (monocular) portions of the visual field, and the other involves a comparison within overlapping (binocular) portions of the visual field, both compare velocities across the two eyes. Here we asked whether human interocular velocity comparisons, typically studied in the context of binocularly overlapping vision, operate in the far lateral (and hence monocular) periphery and, if so, whether these comparisons are consistent with conventional interocular motion processing. We found that speed discrimination was indeed better between the two eyes' monocular visual fields than within a single eye's visual field, but only when the velocities were consistent with commonly encountered motion. This intriguing finding suggests that mechanisms sensitive to relative motion information on opposite sides of an animal may have been retained, or at some point independently achieved, as the eyes became frontal in some animals.
Project description: Commercially available depth-sensing devices are primarily designed for domains that are either macroscopic or static. We develop a solution for fast microscale 3D reconstruction using off-the-shelf components. By adding lenses, precisely calibrating the camera internals and positioning, and developing bespoke software, we turn an infrared depth sensor designed for human-scale motion and object detection into a device with mm-level accuracy capable of recording at up to 30 Hz.
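Once the modified sensor's intrinsics are precisely calibrated, raw depth images can be back-projected to metric point clouds with the standard pinhole model. This generic sketch is not the authors' bespoke software, and it assumes lens-distortion correction has already been applied:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a calibrated depth image (meters) to a point cloud
    using pinhole intrinsics (fx, fy, cx, cy).

    Assumes lens-distortion correction has already been applied."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)   # (h, w, 3) in meters
```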
Project description: Many everyday interactions with moving objects benefit from an accurate perception of their movement. Self-motion, however, complicates object motion perception because it generates a global pattern of motion on the observer's retina and radically influences an object's retinal motion. There is strong evidence that the brain compensates by suppressing the retinal motion due to self-motion; however, this requires estimates of depth relative to the object, since otherwise the appropriate self-motion component to remove cannot be determined. The underlying neural mechanisms are unknown, but neurons in brain areas MT and MST may contribute, given their sensitivity to motion parallax and depth through joint direction, speed, and disparity tuning. We developed a neural model to investigate whether cells in areas MT and MST with well-established neurophysiological properties can account for human object motion judgments during self-motion. We tested the model by comparing simulated object motion signals to human object motion judgments in environments with monocular, binocular, and ambiguous depth. Our simulations show how precise depth information, such as that from binocular disparity, may improve estimates of the retinal motion pattern due to self-motion through increased selectivity among units that respond to the global self-motion pattern. The enhanced self-motion estimates emerged from recurrent feedback connections in MST and allowed the model to better suppress the appropriate direction, speed, and disparity signals from the object's retinal motion, improving the accuracy of the object's movement direction as represented by motion signals.
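The compensation the model performs can be summarized functionally: subtract from the object's retinal motion the flow that the estimated self-motion would produce at the object's image location and estimated depth. This sketch uses the translation-only pinhole flow equation and is not the MT/MST network itself; all names and values are illustrative:

```python
import numpy as np

def object_motion_estimate(retinal_vel, T_hat, x, y, Z_hat, f=1.0):
    """Estimate an object's world-relative motion by subtracting the flow
    that the estimated self-motion T_hat = (Tx, Ty, Tz) would produce at
    the object's image position (x, y) and estimated depth Z_hat.

    A more precise Z_hat (e.g., from binocular disparity) removes a more
    accurate self-motion component, as in the simulations above."""
    Tx, Ty, Tz = T_hat
    self_flow = np.array([(-f * Tx + x * Tz) / Z_hat,
                          (-f * Ty + y * Tz) / Z_hat])
    return np.asarray(retinal_vel) - self_flow
```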