Visual Perception-Based Statistical Modeling of Complex Grain Image for Product Quality Monitoring and Supervision on Assembly Production Line.
ABSTRACT: Computer vision, as a fast, low-cost, noncontact, and online monitoring technology, has become an important tool for inspecting product quality, particularly on large-scale assembly production lines. However, current industrial vision systems remain far from satisfactory at intelligently perceiving complex grain images, which comprise many locally homogeneous fragments or patches with no distinct foreground and background. We attempt to solve this problem through statistical modeling of the spatial structures of grain images. We first give a physical explanation, drawing on the theory of sequential fragmentation well known from the continued comminution of ore grinding, for why the spatial structures of complex grain images follow a representative Weibull distribution. To delineate the spatial structure of a grain image, we present a method of multiscale and omnidirectional Gaussian derivative filtering. We then propose a product quality classifier based on a sparse multikernel least squares support vector machine (LS-SVM) to address the low-confidence classification problem caused by imbalanced data distributions. The proposed method is applied on the assembly line of a food-processing enterprise to automatically classify the production quality of rice. Experiments on this real application case, with comparisons against commonly used methods, illustrate the validity of our method.
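To make the feature-extraction step concrete, here is a minimal sketch of multiscale, omnidirectional Gaussian derivative filtering followed by a two-parameter Weibull fit to the pooled filter magnitudes, in the spirit of the pipeline described above. It assumes NumPy/SciPy; the function names, scales, and orientation count are illustrative choices rather than the authors' implementation.

```python
import numpy as np
from scipy import ndimage
from scipy.stats import weibull_min

def directional_gaussian_derivative(image, sigma, theta):
    """First-order Gaussian derivative of `image` along direction `theta` (radians) at scale `sigma`."""
    gy = ndimage.gaussian_filter(image, sigma, order=(1, 0))  # derivative along rows (y)
    gx = ndimage.gaussian_filter(image, sigma, order=(0, 1))  # derivative along columns (x)
    return np.cos(theta) * gx + np.sin(theta) * gy

def weibull_features(image, sigmas=(1, 2, 4), n_orientations=8):
    """Pool filter magnitudes over scales and orientations, then fit a
    two-parameter Weibull; (shape, scale) summarize the spatial structure."""
    responses = []
    for sigma in sigmas:
        for theta in np.linspace(0.0, np.pi, n_orientations, endpoint=False):
            responses.append(np.abs(directional_gaussian_derivative(image, sigma, theta)).ravel())
    pooled = np.concatenate(responses)
    pooled = pooled[pooled > 0]                        # Weibull support is strictly positive
    shape, _, scale = weibull_min.fit(pooled, floc=0)  # floc=0 gives the 2-parameter fit
    return shape, scale
```

The fitted (shape, scale) pairs would then serve as spatial-structure features for the downstream classifier.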
Project description: Previous studies have demonstrated a qualitative relationship between stone fragility and internal stone morphology. The goal of this study was to quantify morphologic features from dual-energy computed tomography (CT) images and assess their relationship to stone fragility. Thirty-three calcified urinary stones were scanned with micro-CT. Next, they were placed within torso-shaped water phantoms and scanned with the dual-energy CT stone composition protocol in routine use at our institution. Mixed low- and high-energy images were used to measure volume, surface roughness, and 12 metrics describing internal morphology for each stone. The ratios of low- to high-energy CT numbers were also measured. Subsequent to imaging, stone fragility was measured by disintegrating each stone in a controlled ex vivo experiment using an ultrasonic lithotripter and recording the time to comminution. A multivariable linear regression model was developed to predict time to comminution. The average stone volume was 300 mm³ (range: 134-674 mm³). The average comminution time measured ex vivo was 32 seconds (range: 7-115 seconds). Stone volume, dual-energy CT number ratio, and surface roughness were found to have the best combined predictive ability to estimate comminution time (adjusted R² = 0.58). The predictive ability of mixed dual-energy CT images, without use of the dual-energy CT number ratio, to estimate comminution time was slightly inferior, with an adjusted R² of 0.54. Dual-energy CT number ratios, volume, and morphologic metrics may provide a method for predicting stone fragility, as measured by time to comminution from ultrasonic lithotripsy.
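As a hedged illustration of the kind of multivariable linear regression described above (time to comminution regressed on volume, dual-energy CT number ratio, and surface roughness), the sketch below uses synthetic placeholder data; none of the study's actual measurements are reproduced, and scikit-learn is an assumed tool, not the authors' software.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 33                                  # the study used 33 stones
# Synthetic placeholders only; the study's measurements are not reproduced here.
volume = rng.uniform(134, 674, n)       # stone volume, mm^3 (reported range)
ct_ratio = rng.uniform(1.1, 1.6, n)     # low/high-energy CT number ratio (assumed range)
roughness = rng.uniform(0.0, 1.0, n)    # surface roughness metric (assumed scale)
time_s = 0.1 * volume - 40 * ct_ratio + 20 * roughness + rng.normal(0, 5, n)

X = np.column_stack([volume, ct_ratio, roughness])
model = LinearRegression().fit(X, time_s)

# Adjusted R^2 penalizes the number of predictors p (the paper reports 0.58).
r2 = model.score(X, time_s)
p = X.shape[1]
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(f"adjusted R^2 on synthetic data: {adj_r2:.2f}")
```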
Project description: A normally sighted person can see a grating of 30 cycles per degree or higher, but the spatial frequencies needed for motion perception are much lower than that. For natural images with a broad spatial-frequency spectrum, it is unknown how the visible spatial frequencies jointly contribute to motion speed perception. In this work, we studied the effect of spatial frequency content on motion speed estimation for sequences of natural and stochastic pixel images by simulating different visual conditions: normal vision, low vision (low-pass filtering), and complementary vision (high-pass filtering at the same cutoff frequencies as the corresponding low-vision conditions). Speed was computed using a biological motion energy-based computational model. In natural sequences, there was no difference in speed estimation error between the normal vision and low vision conditions, but the error was significantly higher for the complementary vision conditions (containing only high-frequency components) at higher speeds. In stochastic sequences, which have a flat frequency distribution, the error in the normal vision condition was significantly larger than in the low vision conditions at high speeds. In contrast, no such detrimental effect on speed estimation accuracy was found for low spatial frequencies. The simulation results were consistent with a motion direction detection task performed by human observers viewing stochastic sequences. Together, these results (i) reiterate the importance of low frequencies in motion perception, and (ii) indicate that high frequencies may be detrimental for speed estimation when low-frequency content is weak or absent.
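A minimal sketch of how the low-vision (low-pass) and complementary-vision (high-pass) conditions could be simulated in the Fourier domain, assuming NumPy; the cutoff frequency, the pixels-per-degree calibration, and the motion energy model itself are not taken from the study.

```python
import numpy as np

def frequency_filter(frame, cutoff_cpd, pixels_per_degree, mode="low"):
    """Keep spatial frequencies below (mode='low') or above (mode='high')
    `cutoff_cpd` cycles/degree; `pixels_per_degree` ties image samples
    to visual angle."""
    h, w = frame.shape
    fy = np.fft.fftfreq(h)[:, None] * pixels_per_degree  # cycles/degree, vertical
    fx = np.fft.fftfreq(w)[None, :] * pixels_per_degree  # cycles/degree, horizontal
    radius = np.hypot(fy, fx)
    mask = (radius <= cutoff_cpd) if mode == "low" else (radius > cutoff_cpd)
    return np.real(np.fft.ifft2(np.fft.fft2(frame) * mask))

# Example: a low-vision frame and its complementary-vision counterpart share a cutoff.
frame = np.random.rand(128, 128)
low_vision = frequency_filter(frame, cutoff_cpd=2.0, pixels_per_degree=32)
complementary = frequency_filter(frame, cutoff_cpd=2.0, pixels_per_degree=32, mode="high")
```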
Project description: PURPOSE: Visibility of low-spatial frequency stimuli improves when their contrast is modulated at 5 to 10 Hz compared with stationary stimuli. Therefore, temporal modulation of visual objects could enhance the performance of low vision patients, who primarily perceive images of low-spatial frequency content. We investigated the effect of retinal-image jitter on word recognition speed and facial emotion recognition in subjects with central visual impairment. METHODS: Word recognition speed and accuracy of facial emotion discrimination were measured in volunteers with AMD under stationary and jittering conditions. Computer-driven and optoelectronic approaches were used to induce retinal-image jitter with durations of 100 or 166 ms and amplitudes within the range of 0.5 to 2.6° visual angle. Word recognition speed was also measured for participants with simulated (Bangerter filters) visual impairment. RESULTS: Text jittering markedly enhanced word recognition speed for people with severe visual loss (101 ± 25%), while for those with moderate visual impairment, this effect was weaker (19 ± 9%). The ability of low vision patients to discriminate the facial emotions of jittering images improved by a factor of 2. A prototype of optoelectronic jitter goggles produced a similar improvement in facial emotion discrimination. Word recognition speed in participants with simulated visual impairment was enhanced for interjitter intervals over 100 ms and reduced for shorter intervals. CONCLUSIONS: Results suggest that retinal-image jitter with optimal frequency and amplitude is an effective strategy for enhancing visual information processing in the absence of spatial detail. These findings will enable the development of novel tools to improve the quality of life of low vision patients.
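A minimal sketch of step-wise retinal-image jitter as described above, assuming NumPy; the jitter durations (100 or 166 ms) would map to `interval_frames` via the display frame rate, and the 0.5-2.6° amplitudes to `amp_px` via viewing distance and pixel pitch, neither of which is specified here.

```python
import numpy as np

def jitter_sequence(image, n_frames, amp_px, interval_frames, seed=None):
    """Present `image` with step-wise jitter: every `interval_frames` frames
    (e.g. 100-166 ms at the display frame rate) the image jumps to a new
    random offset of up to +/- amp_px pixels along each axis."""
    rng = np.random.default_rng(seed)
    frames, offset = [], (0, 0)
    for t in range(n_frames):
        if t % interval_frames == 0:
            offset = tuple(rng.integers(-amp_px, amp_px + 1, size=2))
        frames.append(np.roll(image, offset, axis=(0, 1)))
    return frames
```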
Project description: Defocus blur is nearly always present in natural images: Objects at only one distance can be perfectly focused. Images of objects at other distances are blurred by an amount depending on pupil diameter and lens properties. Despite the fact that defocus is of great behavioral, perceptual, and biological importance, it is unknown how biological systems estimate defocus. Given a set of natural scenes and the properties of the vision system, we show from first principles how to optimally estimate defocus at each location in any individual image. We show for the human visual system that high-precision, unbiased estimates are obtainable under natural viewing conditions for patches with detectable contrast. The high quality of the estimates is surprising given the heterogeneity of natural images. Additionally, we quantify the degree to which the sign ambiguity often attributed to defocus is resolved by monochromatic aberrations (other than defocus) and chromatic aberrations; chromatic aberrations fully resolve the sign ambiguity. Finally, we show that simple spatial and spatio-chromatic receptive fields extract the information optimally. The approach can be tailored to any environment-vision system pairing: natural or man-made, animal or machine. Thus, it provides a principled general framework for analyzing the psychophysics and neurophysiology of defocus estimation in species across the animal kingdom and for developing optimal image-based defocus and depth estimation algorithms for computational vision systems.
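For intuition only, the sketch below scores candidate defocus levels for a patch by comparing its power spectrum against an expected spectrum (a natural-scene prior shaped by the optics). It substitutes a Gaussian point-spread function and a log-spectral distance for the paper's optics model and optimal estimator, so it is a conceptual stand-in, not the authors' method.

```python
import numpy as np

def defocus_scores(patch, candidate_sigmas, prior_spectrum):
    """Score candidate defocus levels for one image patch.

    `prior_spectrum` is an assumed mean power spectrum of sharp natural
    patches on the same FFT grid; the defocus PSF is approximated as a
    Gaussian of width `sigma` (in pixels)."""
    observed = np.abs(np.fft.fft2(patch - patch.mean())) ** 2
    fy = np.fft.fftfreq(patch.shape[0])[:, None]
    fx = np.fft.fftfreq(patch.shape[1])[None, :]
    f2 = fx ** 2 + fy ** 2
    scores = []
    for sigma in candidate_sigmas:
        mtf2 = np.exp(-(2 * np.pi * sigma) ** 2 * f2)  # squared Gaussian MTF
        expected = prior_spectrum * mtf2
        scores.append(-np.mean((np.log1p(observed) - np.log1p(expected)) ** 2))
    return scores  # argmax indexes the best-matching defocus level
```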
Project description: Melanopsin photoreception enhances retinal responses to variations in ambient light (irradiance) and drives non-image-forming visual reflexes such as circadian entrainment [1-6]. Melanopsin signals also reach brain regions responsible for form vision [7-9], but melanopsin's contribution, if any, to encoding visual images remains unclear. We addressed this deficit using principles of receptor silent substitution to present images in which visibility for melanopsin versus rods+cones was independently modulated, and we recorded evoked responses in the mouse dorsal lateral geniculate nucleus (dLGN; the thalamic relay for cortical vision). Approximately 20% of dLGN units responded to patterns visible only to melanopsin, revealing that melanopsin signals alone can convey spatial information. Spatial receptive fields (RFs) mapped using melanopsin-isolating stimuli had ON centers with diameters of approximately 13°. Melanopsin and rod+cone responses differed in the temporal domain, and responses to slow changes in radiance (<0.9 Hz) and stationary images were deficient when stimuli were rendered invisible to melanopsin. We employed these data to devise and test a mathematical model of melanopsin's involvement in form vision and applied it, along with further experimental recordings, to explore melanopsin signals under simulated active view of natural scenes. Our findings reveal that melanopsin enhances the thalamic representation of scenes containing local correlations in radiance, compensating for the high temporal frequency bias of cone vision and the negative correlation between magnitude and frequency for changes in direction of view. Together, these data reveal a distinct melanopsin contribution to encoding visual images, predicting that, under natural view, melanopsin augments the early visual system's ability to encode patterns over moderate spatial scales.
Project description: During interventional procedures, the deficiencies of nonstereoscopic vision increase the difficulty of identifying the anteroposterior direction and pathways of vessels. Therefore, achieving real-time stereoscopic vision during interventional procedures is meaningful. Pairs of X-ray images were captured with identical parameter settings except for different rotation angles (the inter-view angle). The resulting image pairs were used as left-eye and right-eye views and were horizontally merged into single left-right 3D images. Virtual reality (VR) glasses were used to achieve stereo vision. Pairs of X-ray images from four angiographies with different inter-view angles (1.8-3.4°) were merged into left-right 3D images. Observation with VR glasses produced realistic stereo views of vascular anatomical structure. The results showed that the optimal inter-view angles accepted by the brain for generating stereo vision fell within a narrow range (approximately 1.4-4.1°). Subsequent tests showed that during transcatheter arterial chemoembolization, 3D X-ray stereoscopic images provided significantly improved spatial discrimination and convenience for identifying the supply vessels of a liver tumor and its anteroposterior direction compared with plain X-ray images (all P < 0.01). Real-time X-ray stereoscopic vision can be easily achieved via the straightforward method described herein and has the potential to benefit patients during interventional procedures.
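The image-merging step itself is simple; below is a minimal sketch using Pillow that horizontally concatenates two views into a single left-right 3D frame. File names are hypothetical, and the acquisition of the two rotated views is outside the scope of the snippet.

```python
from PIL import Image

def make_left_right_3d(left_path, right_path, out_path):
    """Horizontally merge two X-ray frames, acquired a few degrees apart,
    into one side-by-side stereo image suitable for VR glasses."""
    left, right = Image.open(left_path), Image.open(right_path)
    assert left.size == right.size, "the two views must share dimensions"
    merged = Image.new(left.mode, (left.width * 2, left.height))
    merged.paste(left, (0, 0))
    merged.paste(right, (left.width, 0))
    merged.save(out_path)

# Hypothetical file names for illustration:
# make_left_right_3d("view_0deg.png", "view_2deg.png", "stereo_pair.png")
```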
Project description: The organic and mineralogical heterogeneity of shale at micrometer and nanometer spatial scales contributes to the quality of gas reserves, gas flow mechanisms, and gas production. Here, we demonstrate two molecular imaging approaches based on infrared spectroscopy that obtain mineral and kerogen information at these mesoscale spatial resolutions in large shale rock samples. The first method is a modified microscopic attenuated total reflectance measurement that utilizes a large germanium hemisphere combined with a focal plane array detector to rapidly capture chemical images of shale rock surfaces spanning hundreds of micrometers with micrometer spatial resolution. The second method, synchrotron infrared nano-spectroscopy, utilizes a metallic atomic force microscope tip to obtain chemical images of micrometer dimensions but with nanometer spatial resolution. This chemically "deconvoluted" imaging at the nano-pore scale is then used to build a machine learning model that generates a molecular distribution map across scales spanning a factor of 1000 in spatial extent. This enables high-throughput geochemical characterization in greater detail across the nano-pore and micro-grain scales and allows us to identify co-localization of mineral phases with chemically distinct organics and even with gas phase sorbents. This characterization is fundamental to understanding the mineral and organic compositions that affect the behavior of shales.
Project description: It is often suggested that transitive inference (TI; if A > B and B > C, then A > C) involves mentally representing overlapping pairs of stimuli in a spatial series. However, there is little direct evidence to unequivocally determine the role of spatial representation in TI. We tested whether humans and rhesus monkeys use spatial representations in TI by training them to organize 7 images in a vertical spatial array. Then, we presented subjects with a TI task using these same images. The implied TI order was either congruent or incongruent with the order of the trained spatial array. Humans in the congruent condition learned premise pairs more quickly, and were faster and more accurate in critical probe tests, suggesting that the spatial arrangement of images learned during spatial training influenced subsequent TI performance. Monkeys first trained in the congruent condition also showed higher test trial accuracy when the spatial and inferred orders were congruent. These results directly support the hypothesis that humans solve TI problems by spatial organization, and suggest that this cognitive mechanism for inference may have ancient evolutionary roots.
Project description: Ever since the discovery of X-rays, tremendous efforts have been made to develop new imaging techniques for unlocking the hidden secrets of our world and enriching our understanding of it. X-ray differential phase contrast imaging, which measures the gradient of a sample's phase shift, can reveal more detail in a weakly absorbing sample than conventional absorption contrast. However, normally only the gradient's components along two mutually orthogonal directions are measurable. In this article, omnidirectional differential phase images, which record the gradient of the phase shift in all directions of the imaging plane, are efficiently generated by scanning an easily obtainable, randomly structured modulator along a spiral path. The retrieved amplitude and main-orientation images for the differential phase yield more information than existing imaging methods provide. Importantly, omnidirectional dark-field images can be extracted simultaneously to study strongly ordered scattering structures. The proposed method can open up new possibilities for studying a wide range of complicated samples composed of both heavy, strongly scattering atoms and light, weakly scattering atoms.
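A minimal sketch, assuming NumPy, of how amplitude and main-orientation images could be recovered from differential phase measurements taken along several directions; the spiral-scan acquisition and modulator analysis are not modeled here.

```python
import numpy as np

def gradient_amplitude_orientation(dpc_stack, angles):
    """Per-pixel amplitude and main orientation of the phase gradient.

    dpc_stack: (N, H, W) differential phase images measured along N directions.
    angles:    (N,) measurement directions, in radians.
    A directional derivative of the phase obeys g(theta) = gx*cos(theta) +
    gy*sin(theta), so a per-pixel least-squares fit recovers (gx, gy); then
    amplitude = |grad| and main orientation = atan2(gy, gx)."""
    design = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (N, 2)
    flat = dpc_stack.reshape(len(angles), -1)                    # (N, H*W)
    coeffs, *_ = np.linalg.lstsq(design, flat, rcond=None)       # (2, H*W)
    gx, gy = coeffs.reshape(2, *dpc_stack.shape[1:])
    return np.hypot(gx, gy), np.arctan2(gy, gx)
```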
Project description: In this paper we tackle the problem of indoor robot localization using a vision-based approach. Specifically, we propose a visual odometer able to return the relative pose of an omnidirectional automatic guided vehicle (AGV) that moves inside an indoor industrial environment. A monocular downward-looking camera, with its optical axis nearly perpendicular to the ground floor, is used to collect floor images. After a preliminary image analysis detects robust point features (keypoints), descriptors associated with the keypoints make it possible to match the detected points across consecutive frames. A robust correspondence filter based on statistical and geometrical information is devised to reject incorrect matches, thus delivering better pose estimates. A camera pose compensation is further introduced to ensure better positioning accuracy. The effectiveness of the proposed methodology has been demonstrated through several experiments, in the laboratory as well as in an industrial setting, with both quantitative and qualitative evaluations. The method provides a final positioning error of 0.21% over an average distance of 17.2 m. A longer run in an industrial context provided comparable results (an error of 0.94% after about 80 m). The average relative positioning error is about 3%, which is in good agreement with the current state of the art.
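A minimal sketch of the keypoint-based frame-to-frame estimation loop, using OpenCV's ORB features and a RANSAC-fitted similarity transform as a generic stand-in for the paper's statistical/geometrical correspondence filter; the camera pose compensation step is omitted.

```python
import cv2
import numpy as np

def relative_pose_2d(prev_frame, curr_frame, orb, matcher):
    """Estimate planar AGV motion between consecutive floor images: detect ORB
    keypoints, match descriptors, and reject bad correspondences with a
    RANSAC-fitted similarity transform."""
    kp1, des1 = orb.detectAndCompute(prev_frame, None)
    kp2, des2 = orb.detectAndCompute(curr_frame, None)
    matches = matcher.match(des1, des2)
    src = np.float32([kp1[m.queryIdx].pt for m in matches])
    dst = np.float32([kp2[m.trainIdx].pt for m in matches])
    M, inliers = cv2.estimateAffinePartial2D(
        src, dst, method=cv2.RANSAC, ransacReprojThreshold=2.0)
    dx, dy = M[0, 2], M[1, 2]              # translation, pixels
    dtheta = np.arctan2(M[1, 0], M[0, 0])  # in-plane rotation, radians
    return dx, dy, dtheta

orb = cv2.ORB_create(nfeatures=1000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
```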