Project description: In recent years, maintenance work on public transport routes has decreased drastically in many countries because of difficult economic conditions. Studies conducted by driver associations and road-safety groups conclude that accidents are increasing due to poor road-surface conditions, which also damage vehicles through costly breakdowns. Currently, road damage is detected manually or with dedicated road vehicles, both of which incur high labor costs. To address this problem, many research centers are investigating image-processing techniques that identify damaged road areas using deep learning algorithms. The main objective of this work is to design a distributed platform that detects damage to transport routes using drones and reports the results of the most important classifiers. A case study is presented using a multi-agent system based on PANGEA that coordinates the different parts of the architecture using techniques based on ubiquitous computing. The results obtained with a customized You Only Look Once (YOLO) v4 classifier are promising, reaching an accuracy of more than 95%. The images used have been published as a dataset for use by the scientific community.
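The description mentions a customized YOLOv4 classifier but not how it is deployed. The sketch below shows one common way to run YOLOv4 inference with OpenCV's DNN module; it is not the paper's pipeline, and the config, weights, class-list, and image file names are hypothetical placeholders for the project's own trained artifacts.

```python
# Minimal YOLOv4 inference sketch with OpenCV DNN (hypothetical file names throughout).
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov4-road-damage.cfg", "yolov4-road-damage.weights")
classes = open("classes.names").read().splitlines()

image = cv2.imread("drone_frame.jpg")          # hypothetical drone image
h, w = image.shape[:2]

# YOLOv4 expects a square, normalized input blob (608x608 is a common size).
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (608, 608), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(net.getUnconnectedOutLayersNames())

boxes, confidences, class_ids = [], [], []
for output in outputs:
    for det in output:                          # det = [cx, cy, bw, bh, objectness, class scores...]
        scores = det[5:]
        class_id = int(np.argmax(scores))
        confidence = float(scores[class_id])
        if confidence > 0.5:
            cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            confidences.append(confidence)
            class_ids.append(class_id)

# Non-maximum suppression removes overlapping detections of the same damaged area.
keep = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
for i in np.array(keep).flatten():
    print(classes[class_ids[i]], confidences[i], boxes[i])
```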
Project description: The term quality of life (QoL) covers a wide range of multifaceted concepts that often involve subjective assessments of both positive and negative aspects of life. QoL is difficult to quantify because the term has varied meanings across academic fields and can carry different connotations in different circumstances. The sectors most commonly associated with QoL, however, are Health, Education, Environmental Quality, Personal Security, Civic Engagement, and Work-Life Balance. An emerging issue that falls under environmental quality is visual pollution (VP), which, as detailed in this study, refers to disruptive presences that limit visual ability on public roads, with an emphasis on excavation barriers, potholes, and dilapidated sidewalks. Quantifying VP has always been difficult due to its subjective nature and the lack of a consistent set of rules for its systematic assessment. This emphasizes the need for research and module development that will allow government agencies to automatically predict and detect VP. Our dataset was collected from different regions in the Kingdom of Saudi Arabia (KSA) via the Ministry of Municipal and Rural Affairs and Housing (MOMRAH) as part of a VP campaign to improve Saudi Arabia's urban landscape. It consists of 34,460 RGB images separated into three distinct classes: excavation barriers, potholes, and dilapidated sidewalks. To annotate all images for detection (i.e., bounding boxes) and classification (i.e., class labels), a deep active learning (DAL) strategy is used in which an initial 1,200 VP images (i.e., 400 images per class) are manually annotated by four experts. Images containing more than one object increase the number of training object ROIs, recorded as 8,417 for excavation barriers, 25,975 for potholes, and 7,412 for dilapidated sidewalks. The MOMRAH dataset is publicly published to enrich the research domain with a new VP image dataset.
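To make the annotation strategy concrete, the sketch below shows a generic deep-active-learning round based on least-confidence sampling: a model trained on the small manually annotated seed set scores the unlabelled pool, and only the most uncertain images are routed to human experts. This is an illustrative loop, not MOMRAH's exact DAL protocol, and `train_model` / `predict_proba` are hypothetical stand-ins for any detector or classifier pipeline.

```python
# Illustrative deep-active-learning (DAL) round using least-confidence sampling.
import numpy as np

def active_learning_round(labeled, unlabeled, train_model, predict_proba, budget=200):
    """Return indices of the `budget` most uncertain unlabeled images."""
    model = train_model(labeled)                     # e.g. fine-tune on the 1,200 seed images
    probs = predict_proba(model, unlabeled)          # shape: (n_images, n_classes)
    confidence = probs.max(axis=1)                   # top class probability per image
    uncertainty = 1.0 - confidence                   # least-confidence criterion
    return np.argsort(uncertainty)[::-1][:budget]    # most uncertain first, up to the budget
```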
Project description: Background: More and more herbaria are digitising their collections. Images of specimens are made available online to facilitate access to them and allow extraction of information from them. Transcription of the data written on specimens is critical for general discoverability and enables incorporation into large aggregated research datasets. Different methods, such as crowdsourcing and artificial intelligence, are being developed to optimise transcription, but herbarium specimens pose difficulties in data extraction for many reasons. New information: To provide developers of transcription methods with a means of optimisation, we have compiled a benchmark dataset of 1,800 herbarium specimen images with corresponding transcribed data. These images originate from nine different collections and include specimens that reflect the multiple potential obstacles that transcription methods may encounter, such as differences in language, text format (printed or handwritten), specimen age and nomenclatural type status. We are making these specimens available with a Creative Commons Zero licence waiver and with permanent online storage of the data. By doing this, we are minimising the obstacles to the use of these images for transcription training. This benchmark dataset of images may also be used where a defined and documented set of herbarium specimens is needed, such as for the extraction of morphological traits, handwriting recognition and colour analysis of specimens.
Project description: The detection of wheat heads in plant images is an important task for estimating pertinent wheat traits, including head population density and head characteristics such as health, size, maturity stage, and the presence of awns. Several studies have developed methods for wheat head detection from high-resolution RGB imagery based on machine learning algorithms. However, these methods have generally been calibrated and validated on limited datasets. High variability in observational conditions, genotypic differences, development stages, and head orientation makes wheat head detection a challenge for computer vision. Further, possible blurring due to motion or wind and overlap between heads in dense populations make this task even more complex. Through a joint international collaborative effort, we have built a large, diverse, and well-labelled dataset of wheat images, called the Global Wheat Head Detection (GWHD) dataset. It contains 4,700 high-resolution RGB images and 190,000 labelled wheat heads collected from several countries around the world at different growth stages and with a wide range of genotypes. Guidelines are proposed for image acquisition, for associating minimum metadata in line with FAIR principles, and for consistent head labelling when developing new head detection datasets. The GWHD dataset is publicly available at http://www.global-wheat.com/ and is aimed at developing and benchmarking methods for wheat head detection.
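For readers who want to work with the labelled heads, the sketch below reads GWHD-style annotations. It assumes a CSV layout similar to the widely circulated Kaggle release of the dataset, with an `image_id` column and a `bbox` column holding "[x, y, width, height]" strings; the file name and column names should be verified against the files actually downloaded.

```python
# Hedged sketch for grouping GWHD-style bounding boxes by image (assumed CSV layout).
import ast
from collections import defaultdict

import pandas as pd

df = pd.read_csv("train.csv")                      # hypothetical annotation file
boxes_per_image = defaultdict(list)
for image_id, bbox in zip(df["image_id"], df["bbox"]):
    x, y, w, h = ast.literal_eval(bbox)            # parse the "[x, y, w, h]" string
    boxes_per_image[image_id].append((x, y, x + w, y + h))  # convert to (x1, y1, x2, y2)

print(len(boxes_per_image), "images with labelled wheat heads")
```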
Project description: Accurate building extraction is crucial for urban understanding, but it often requires a substantial number of building samples. While some building datasets are available for model training, there remains a lack of high-quality building datasets covering both urban and rural areas in China. To fill this gap, this study creates a high-resolution GaoFen-7 (GF-7) Building dataset from Chinese GF-7 imagery of six Chinese cities. The dataset comprises 5,175 pairs of 512 × 512 image tiles, covering 573.17 km². It contains 170,015 buildings, with 84.8% of the buildings in urban areas and 15.2% in rural areas. The usability of the GF-7 Building dataset has been demonstrated with seven convolutional neural networks, all achieving an overall accuracy (OA) exceeding 93%. Experiments have shown that the GF-7 Building dataset can be used for building extraction in both urban and rural scenarios. The proposed dataset offers high quality and high diversity. It supplements existing building datasets and will contribute to promoting new algorithms for building extraction, as well as facilitating intelligent building interpretation in China.
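The overall accuracy (OA) reported above is, in the usual segmentation sense, the fraction of pixels classified correctly. A minimal check is sketched below, assuming `pred` and `truth` are 512 × 512 binary masks (1 = building, 0 = background) from any extraction model; the random arrays are placeholders for real tiles.

```python
# Minimal pixel-wise overall accuracy (OA) check for a binary building mask.
import numpy as np

def overall_accuracy(pred: np.ndarray, truth: np.ndarray) -> float:
    """Fraction of pixels whose predicted class matches the reference label."""
    return float((pred == truth).mean())

pred = np.random.randint(0, 2, (512, 512))   # placeholder prediction tile
truth = np.random.randint(0, 2, (512, 512))  # placeholder ground-truth tile
print(f"OA = {overall_accuracy(pred, truth):.3f}")
```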
Project description: A new hybrid vehicle detection scheme that integrates the Viola-Jones (V-J) method and a linear SVM classifier with HOG features (HOG + SVM) is proposed for vehicle detection in low-altitude unmanned aerial vehicle (UAV) images. Because both V-J and HOG + SVM are sensitive to the in-plane rotation of on-road vehicles, the proposed scheme first adopts a roadway orientation adjustment method, which rotates each UAV image to align the roads with the horizontal direction so that the original V-J or HOG + SVM method can be applied directly to achieve fast detection and high accuracy. To address the descending detection speed of V-J and HOG + SVM, the proposed scheme further develops an adaptive switching strategy that integrates the two methods based on their different trends of declining detection speed to improve detection efficiency. A comprehensive evaluation shows that the switching strategy, combined with the road orientation adjustment method, can significantly improve the efficiency and effectiveness of vehicle detection from UAV images. The results also show that the proposed vehicle detection method is competitive with other existing vehicle detection methods. Furthermore, because the proposed method can be applied to videos captured from moving UAV platforms without image registration or an additional road database, it has great potential for field applications. Future research will focus on extending the current method to detect other transportation modes such as buses, trucks, motorcycles, bicycles, and pedestrians.
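The sketch below illustrates the two pre-processing ideas in simplified form: rotating a UAV frame so the road runs horizontally, and dispatching the frame to either a Viola-Jones cascade or a HOG + SVM detector. The road-angle estimate, the switching flag, and the `vj_cascade` / `hog` objects (a `cv2.CascadeClassifier` and a `cv2.HOGDescriptor` already trained or configured for vehicles) are assumptions, not the paper's actual criteria or models.

```python
# Simplified road alignment and detector switching for UAV frames (assumed inputs).
import cv2

def align_road(frame, road_angle_deg):
    """Rotate the frame so a road at `road_angle_deg` becomes horizontal."""
    h, w = frame.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2, h / 2), road_angle_deg, 1.0)
    return cv2.warpAffine(frame, M, (w, h))

def detect_vehicles(frame, vj_cascade, hog, use_viola_jones):
    """Apply whichever detector the switching strategy selected for this frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if use_viola_jones:
        return vj_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)
    boxes, _ = hog.detectMultiScale(gray, winStride=(8, 8))
    return boxes
```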
Project description: We present the HIT-UAV dataset, a high-altitude infrared thermal dataset for object detection applications on unmanned aerial vehicles (UAVs). The dataset comprises 2,898 infrared thermal images extracted from 43,470 frames in hundreds of videos captured by UAVs in various scenarios, such as schools, parking lots, roads, and playgrounds. Moreover, the HIT-UAV provides essential flight data for each image, including flight altitude, camera perspective, date, and daylight intensity. For each image, we have manually annotated object instances with bounding boxes of two types (oriented and standard) to tackle the challenge of significant overlap between object instances in aerial images. To the best of our knowledge, the HIT-UAV is the first publicly available high-altitude UAV-based infrared thermal dataset for detecting persons and vehicles. We have trained and evaluated well-established object detection algorithms on the HIT-UAV. Our results demonstrate that the detection algorithms perform exceptionally well on the HIT-UAV compared to visible-light datasets, since infrared thermal images do not contain significant irrelevant information about objects. We believe that the HIT-UAV will contribute to various UAV-based applications and research. The dataset is freely available at https://pegasus.ac.cn.
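Since the dataset provides both oriented and standard boxes, a small geometry helper is often needed to move between the two. The sketch below assumes an oriented box is given as (cx, cy, w, h, angle in degrees); the actual field layout in the HIT-UAV annotation files may differ, so treat this as a geometry illustration rather than a parser for the released format.

```python
# Convert an assumed (cx, cy, w, h, angle) oriented box to corners and to a standard box.
import math

def oriented_box_corners(cx, cy, w, h, angle_deg):
    """Return the four corner points of a rotated rectangle."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a) for dx, dy in half]

def to_axis_aligned(corners):
    """Smallest standard (axis-aligned) box enclosing the oriented one."""
    xs, ys = zip(*corners)
    return min(xs), min(ys), max(xs), max(ys)

print(to_axis_aligned(oriented_box_corners(100, 80, 40, 20, 30)))
```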
Project description: The SUT-Crack dataset (Sharif University of Technology Crack Dataset) presents a collection of high-quality images of asphalt pavement cracks, specifically designed to facilitate crack detection with various deep learning methods, including classification, object detection, and segmentation. During the dataset creation process, careful consideration was given to covering the main crack detection challenges, such as oil stains and shadows on the pavement surface along with varying lighting conditions. The dataset comprises 130 images designed specifically for segmentation and object detection tasks, each accompanied by precise ground truth annotations. This dataset is well-suited to various crack detection methods, offering accurate annotations that enhance its reliability and usefulness across diverse applications. Moreover, the images were taken from a fixed height of 672 mm above the pavement surface, enabling straightforward calibration to derive real-world crack lengths from pixel measurements. A notable feature of the SUT-Crack dataset is the inclusion of geotags: each image carries precise latitude and longitude coordinates, which allows the images to be visualized on a map and gives the dataset valuable geographical context. Additionally, by dividing the original images into 200×200-pixel tiles, over 25,000 images were produced and then categorized into "with crack" and "without crack" classes for classification purposes. SUT-Crack is available at https://doi.org/10.17632/gsbmknrhkv.6.
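The 200×200 tiling step and the height-based calibration can be sketched as follows. The crack/no-crack decision here uses a hypothetical binary ground-truth mask aligned with the image, and the `mm_per_pixel` value is a placeholder that would in practice be derived from the fixed 672 mm capture height and the camera parameters.

```python
# Sketch of 200x200 tiling with crack labels and a placeholder pixel-to-mm conversion.
import numpy as np

def tile_image(image, mask, tile=200):
    """Split an image (and its crack mask) into tile x tile patches with labels."""
    patches = []
    h, w = image.shape[:2]
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            patch = image[y:y + tile, x:x + tile]
            has_crack = bool(mask[y:y + tile, x:x + tile].any())
            patches.append((patch, "with crack" if has_crack else "without crack"))
    return patches

mm_per_pixel = 0.5                 # hypothetical value from height-based calibration
crack_length_px = 840
print(f"approx. crack length: {crack_length_px * mm_per_pixel:.0f} mm")
```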
Project description: This paper presents a large publicly available multi-center lumbar spine magnetic resonance imaging (MRI) dataset with reference segmentations of vertebrae, intervertebral discs (IVDs), and spinal canal. The dataset includes 447 sagittal T1 and T2 MRI series from 218 patients with a history of low back pain and was collected from four different hospitals. An iterative data annotation approach was used by training a segmentation algorithm on a small part of the dataset, enabling semi-automatic segmentation of the remaining images. The algorithm provided an initial segmentation, which was subsequently reviewed, manually corrected, and added to the training data. We provide reference performance values for this baseline algorithm and nnU-Net, which performed comparably. Performance values were computed on a sequestered set of 39 studies with 97 series, which were additionally used to set up a continuous segmentation challenge that allows for a fair comparison of different segmentation algorithms. This study may encourage wider collaboration in the field of spine segmentation and improve the diagnostic value of lumbar spine MRI.
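Reference performance for segmentations of this kind is typically reported as an overlap score per structure. The sketch below computes the Dice similarity coefficient for one binary mask pair (e.g. a single vertebra, disc, or the spinal canal); the challenge may use additional metrics, so this is only an illustrative evaluation.

```python
# Dice similarity coefficient between a predicted and a reference binary mask.
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-7) -> float:
    """2 * |intersection| / (|pred| + |truth|), with eps guarding empty masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + truth.sum() + eps))

pred = np.zeros((64, 64), dtype=bool); pred[20:40, 20:40] = True    # placeholder masks
truth = np.zeros((64, 64), dtype=bool); truth[22:42, 22:42] = True
print(f"Dice = {dice(pred, truth):.3f}")
```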