Project description: Floods are natural disasters that repeatedly occur in Indonesia, causing substantial material losses and claiming many lives. Meanwhile, social media data has emerged as a valuable resource for analyzing user behaviour and interests, and its use for flood-related information is increasing. In this paper, we present a flood dataset collected from Instagram Reels, consisting of videos depicting flood events in Parepare. The videos were collected from different areas, time conditions, and viewpoints, and converted into image frames. The dataset includes 7248 images. The images undergo preprocessing to ensure a clear depiction of the flood event and its differentiation from surrounding elements. Each object is annotated with a distinct color label, facilitating recognition and understanding in various computer vision applications. Overall, this flood dataset is a valuable resource for computer vision research, especially for semantic segmentation methods, and promotes the development of algorithms for flood area identification and object recognition in flood-affected areas.
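As a rough illustration of how such color-labeled annotations are typically consumed, the sketch below maps an RGB annotation mask to a class-index map. The specific label colors and class names are assumptions for illustration; the dataset's actual color coding would need to be taken from its documentation.

```python
import numpy as np

# Hypothetical color-to-class mapping; the dataset's real label colors
# are not given in the description above and must come from its docs.
COLOR_TO_CLASS = {
    (255, 0, 0): 1,   # flood water (assumed)
    (0, 255, 0): 2,   # vegetation (assumed)
    (0, 0, 255): 3,   # buildings (assumed)
}

def mask_to_indices(rgb_mask: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) color-coded annotation to an (H, W) class-index map.

    Unmapped colors fall back to class 0 (background).
    """
    indices = np.zeros(rgb_mask.shape[:2], dtype=np.uint8)
    for color, cls in COLOR_TO_CLASS.items():
        indices[np.all(rgb_mask == color, axis=-1)] = cls
    return indices
```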
Project description: Purpose/aim: This paper provides a pedagogical example of systematic machine learning optimization for small-dataset image segmentation, emphasizing hyperparameter selection. A simple process is presented for medical physicists to examine hyperparameter optimization, and it is applied to a case study demonstrating the benefit of the method. Materials and methods: An unrestricted public Computed Tomography (CT) dataset with binary organ segmentation was used to develop a multiclass segmentation model. To start the optimization process, a preliminary manual search of hyperparameters was conducted, and from there a grid search identified the most influential result metrics. A total of 658 different models were trained over 2100 h, using 13,160 effective patients. The body of results was analyzed using random forest regression, identifying relative hyperparameter impact. Results: Metric-implied segmentation quality (accuracy 96.8%, precision 95.1%) and visual inspection were found to be mismatched. In this work batch normalization was most important, but performance varied with the hyperparameters and metrics selected. Targeted grid-search optimization combined with random forest analysis of relative hyperparameter importance proved an easily implementable sensitivity-analysis approach. Conclusion: The proposed optimization method gives a systematic and quantitative approach to something intuitively understood: that hyperparameters change model performance. Even the basic grid-search optimization with random forest analysis presented here can be informative within hardware and data quality/availability limitations, adding confidence to model validity and minimizing decision-making risks. By providing a guided methodology, this work helps medical physicists improve their model optimization, irrespective of the specific challenges posed by their datasets and model designs.
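A minimal sketch of the analysis step described above, assuming a hypothetical grid-search log with columns such as learning_rate and batch_norm: fit a random forest regressor to the hyperparameter-to-metric mapping and read feature importances as relative hyperparameter impact. This illustrates the general approach, not the paper's exact pipeline.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical grid-search log: one row per trained model, with the
# hyperparameters used and the resulting validation metric.
results = pd.DataFrame({
    "learning_rate": [1e-3, 1e-3, 1e-4, 1e-4, 1e-3, 1e-4],
    "batch_norm":    [1, 0, 1, 0, 1, 0],
    "batch_size":    [8, 16, 8, 16, 16, 8],
    "val_dice":      [0.91, 0.84, 0.88, 0.80, 0.89, 0.82],
})

X = results.drop(columns="val_dice")
y = results["val_dice"]

# Fit a random forest to the (hyperparameters -> metric) mapping and read
# off feature importances as relative hyperparameter impact.
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
for name, imp in sorted(zip(X.columns, rf.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```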
Project description: Smart farming (SF) applications rely on robust and accurate computer vision systems. An important computer vision task in agriculture is semantic segmentation, which aims to classify each pixel of an image and can be used for selective weed removal. State-of-the-art implementations use convolutional neural networks (CNNs) trained on large image datasets. In agriculture, publicly available RGB image datasets are scarce and often lack detailed ground-truth information. In contrast to agriculture, other research areas feature RGB-D datasets that combine color (RGB) with additional distance (D) information, and results there show that including distance as an additional modality can further improve model performance. We therefore introduce WE3DS as the first RGB-D image dataset for multi-class plant species semantic segmentation in crop farming. It contains 2568 RGB-D images (color image and distance map) and corresponding hand-annotated ground-truth masks. Images were taken under natural light conditions using an RGB-D sensor consisting of two RGB cameras in a stereo setup. Further, we provide a benchmark for RGB-D semantic segmentation on the WE3DS dataset and compare it with a solely RGB-based model. Our trained models achieve up to 70.7% mean Intersection over Union (mIoU) for discriminating between soil, seven crop species, and ten weed species. Finally, our work confirms the finding that additional distance information improves segmentation quality.
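For reference, mean Intersection over Union as reported above can be computed along these lines; this is a generic sketch over integer-labeled prediction and ground-truth maps, not the paper's evaluation code.

```python
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """Mean Intersection over Union for integer-labeled segmentation maps."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                 # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))
```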
Project description: This paper presents a dataset of bird's eye chilies on a single farm for semantic segmentation. The dataset is generated using two cameras, aligned left and right, forming a stereo-vision video capture. By analyzing the disparity between corresponding points in the left and right images, algorithms can calculate the relative distance of objects in the scene. This depth information is useful in various applications, including 3D reconstruction, object tracking, and autonomous navigation. The dataset consists of 1150 left and right compressed images extracted from ten sets of stereo videos taken at ten different locations within the chili farm, all of bird's eye chilies of the same age. Since the dataset is intended for semantic segmentation, ground-truth images with manual semantic segmentation are also provided. The dataset can be used for 2D and 3D semantic segmentation of the bird's eye chili farm. The object classes in this dataset include sky, living things, plantation, flat, construction, nature, and misc.
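A minimal sketch of the disparity-to-depth computation described above, using OpenCV's semi-global block matcher. The file names, matcher parameters, focal length, and baseline are placeholders; real values would come from the dataset's stereo calibration.

```python
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Semi-global block matching; parameter values are illustrative only.
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=7)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # SGBM is fixed-point

# Depth from disparity: Z = f * B / d. Focal length (pixels) and baseline
# (meters) are hypothetical; real values come from the stereo calibration.
FOCAL_PX, BASELINE_M = 700.0, 0.12
valid = disparity > 0
depth = np.zeros_like(disparity)
depth[valid] = FOCAL_PX * BASELINE_M / disparity[valid]
```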
Project description: The purpose of the dataset is to provide annotated images for pixel classification tasks with application to powered wheelchair users. As some of the widely available datasets contain only general objects, we introduce this dataset to cover the missing pieces, which can be considered application-specific objects. These objects of interest are important not only for powered wheelchair users but also for indoor navigation and environmental understanding in general. For example, indoor assistive and service robots need to comprehend their surroundings to ease navigation and interaction with objects of different sizes. The proposed dataset is recorded using a camera installed on a powered wheelchair. The camera is installed beneath the joystick so that it has a clear view, unobstructed by the user's body or legs. The powered wheelchair is driven through the corridors of the indoor environment, and a one-minute video is recorded. The collected video is annotated at the pixel level for semantic segmentation (pixel classification) tasks. Pixels of different objects are annotated using MATLAB software. The dataset contains objects of various sizes (small, medium, and large), which explains the variation in pixel distribution across the dataset. Deep Convolutional Neural Networks (DCNNs) that perform well on large objects often fail to produce accurate results on small objects, whereas training a DCNN on a dataset with objects of multiple sizes can yield more robust systems. Although the recorded objects are vital for many applications, we have included additional images of different kinds of door handles at different angles, orientations, and illuminations, as these are rare in publicly available datasets. The proposed dataset has 1549 images and covers nine different classes. We used the dataset to train and test a semantic segmentation system that can aid and guide visually impaired users by providing visual cues. The dataset is made publicly available at this link.
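One common way to quantify the pixel distribution across object sizes mentioned above is to count per-class pixel frequencies and derive class weights from them, for example via median-frequency balancing to up-weight rare (typically small) classes. The sketch below is a generic illustration, not the authors' procedure.

```python
import numpy as np

def class_pixel_stats(masks, num_classes):
    """Count per-class pixel frequencies over a list of (H, W) index masks."""
    counts = np.zeros(num_classes, dtype=np.int64)
    for m in masks:
        counts += np.bincount(m.ravel(), minlength=num_classes)
    freq = counts / counts.sum()
    # Median-frequency balancing: up-weights rare (typically small) classes.
    weights = np.median(freq[freq > 0]) / np.maximum(freq, 1e-12)
    return freq, weights
```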
Project description: The development of compact and energy-efficient wearable sensors has led to an increase in the availability of biosignals. To effectively and efficiently analyze continuously recorded, multidimensional time series at scale, the ability to perform meaningful unsupervised data segmentation is a promising target. A common way to achieve this is to identify change-points within the time series as the segmentation basis. However, traditional change-point detection algorithms often come with drawbacks that limit their real-world applicability. Notably, they generally rely on the complete time series being available and thus cannot be used for real-time applications. Another common limitation is that they handle the segmentation of multidimensional time series poorly, or not at all. The main contribution of this work is therefore a novel unsupervised segmentation algorithm for multidimensional time series, named Latent Space Unsupervised Semantic Segmentation (LS-USS), designed to work easily with both online and batch data. LS-USS addresses the challenge of multivariate change-point detection by using an autoencoder to learn a 1-dimensional latent space on which change-point detection is then performed. To address the challenge of real-time time series segmentation, this work introduces the Local Threshold Extraction Algorithm (LTEA) and a "batch collapse" algorithm. The "batch collapse" algorithm enables LS-USS to process streaming data by dividing it into manageable batches, while LTEA detects change-points wherever the metric computed by LS-USS exceeds the extracted local threshold. Used in combination, these algorithms allow our approach to accurately segment time series data in real time, making it well suited for applications where timely detection of changes is critical. When evaluated on a variety of real-world datasets, LS-USS systematically achieves equal or better performance than the state-of-the-art change-point detection algorithms it is compared against, in both offline and real-time settings.
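A toy sketch of the overall idea, assuming a synthetic series and a deliberately small architecture: an autoencoder compresses sliding windows to a 1-dimensional latent space, and change-points are flagged where consecutive latent values jump. It uses a simple global threshold rather than the paper's LTEA, and stands in for, rather than reproduces, LS-USS.

```python
import numpy as np
import torch
import torch.nn as nn

# Toy multivariate series with a change-point at t=500 (for illustration).
rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0, 1, (500, 4)),
                         rng.normal(3, 1, (500, 4))]).astype(np.float32)

WIN = 32
windows = np.stack([series[i:i + WIN].ravel()
                    for i in range(len(series) - WIN)])
x = torch.from_numpy(windows)

# Minimal autoencoder with a 1-dimensional bottleneck, standing in for the
# paper's latent-space model; architecture and sizes are assumptions.
ae = nn.Sequential(
    nn.Linear(WIN * 4, 64), nn.ReLU(), nn.Linear(64, 1),   # encoder
    nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, WIN * 4),   # decoder
)
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(ae(x), x)
    loss.backward()
    opt.step()

# 1-D latent trajectory; large jumps between consecutive windows are
# flagged as change-points with a simple global threshold here (the
# paper's LTEA extracts local thresholds instead).
with torch.no_grad():
    z = ae[:3](x).squeeze(1).numpy()   # encoder half of the Sequential
jumps = np.abs(np.diff(z))
change_points = np.where(jumps > jumps.mean() + 3 * jumps.std())[0]
print(change_points)
```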
Project description: Angiogenesis is the development of new blood vessels from pre-existing ones. It is a complex, multifaceted process that is essential for the adequate functioning of the human organism. Angiogenesis is investigated using various methods; one of the most popular and most serviceable in vitro is the short-term culture of endothelial cells on Matrigel. A significant disadvantage of this method, however, is the manual analysis of large numbers of microphotographs. It is therefore necessary to develop a technique for automating the annotation of images of capillary-like structures. Despite the increasing use of deep learning in biomedical image analysis, to the best of our knowledge there has not yet been a study applying it to angiogenesis images. This article demonstrates the first tool based on a convolutional Unet++ encoder-decoder architecture for the semantic segmentation of in vitro angiogenesis simulation images, followed by postprocessing of the resulting masks for data analysis by experts. The first annotated dataset in this field, AngioCells, is also being made publicly available. To create this dataset, participants were recruited into a markup group, an annotation protocol was developed, and an inter-participant agreement study was carried out.
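A minimal sketch of instantiating such a Unet++ model with the segmentation_models_pytorch library; the encoder backbone, input size, and two-class setup are assumptions for illustration, not necessarily the authors' configuration.

```python
import torch
import segmentation_models_pytorch as smp

# A Unet++ encoder-decoder along the lines the paper describes; the
# backbone and class count below are assumptions, not the authors' setup.
model = smp.UnetPlusPlus(
    encoder_name="resnet34",
    encoder_weights="imagenet",
    in_channels=3,
    classes=2,          # background vs. capillary-like structure (assumed)
)

batch = torch.randn(4, 3, 256, 256)   # dummy microphotograph batch
logits = model(batch)                 # (4, 2, 256, 256) per-pixel scores
masks = logits.argmax(dim=1)          # hard segmentation masks
```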
Project description: The Semantic Coherence Dataset has been designed for experimenting with semantic coherence metrics. More specifically, the dataset was built to test whether probabilistic measures, such as perplexity, provide stable scores for analyzing spoken language. Perplexity, originally conceived as an information-theoretic measure to assess the probabilistic inference properties of language models, has recently been proven an appropriate tool for categorizing speech transcripts based on semantic coherence; in particular, it has been successfully employed to discriminate between subjects suffering from Alzheimer's disease and healthy controls. The collected data include speech transcripts intended to investigate semantic coherence at two levels, and are thus arranged into two classes: intra-subject semantic coherence and inter-subject semantic coherence. In the former case, transcripts from a single speaker can be employed to train and test language models, to explore whether the perplexity metric provides stable scores when assessing talks from that speaker while still distinguishing between two different forms of speech, political rallies and interviews. In the latter case, models can be trained on transcripts from a given speaker and then used to measure how stable the perplexity metric is when computed using that speaker's model on transcripts from different speakers. Transcripts were extracted from talks lasting almost 13 hours (overall 12:45:17 and 120,326 tokens) for the former class, and almost 30 hours (29:47:34 and 252,270 tokens) for the latter. These data can be reused to perform analyses of measures built on top of language models and, more generally, of measures aimed at exploring the linguistic features of text documents.
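As a concrete reminder of the metric, the sketch below computes perplexity, the exponential of the negative mean log-probability, using a deliberately simple add-one-smoothed unigram model as a stand-in for the language models the dataset targets.

```python
import math
from collections import Counter

def unigram_perplexity(train_tokens, test_tokens):
    """Perplexity of an add-one-smoothed unigram model.

    PPL = exp(-(1/N) * sum_i log p(w_i)); lower values mean the test
    transcript is better predicted by (more coherent with) the model.
    """
    counts = Counter(train_tokens)
    total, vocab = sum(counts.values()), len(counts)
    log_prob = sum(math.log((counts[w] + 1) / (total + vocab + 1))
                   for w in test_tokens)
    return math.exp(-log_prob / len(test_tokens))

# Intra-subject use: train and test on transcripts from the same speaker.
print(unigram_perplexity("we will build and we will win".split(),
                         "we will build again".split()))
```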
Project description: We present an algorithm for the per-voxel semantic segmentation of a three-dimensional volume. At the core of our algorithm is a novel "pyramid context" feature, a descriptive representation designed such that exact per-voxel linear classification can be made extremely efficient. This feature not only allows for efficient semantic segmentation but also enables other aspects of our algorithm, such as novel learned features and a stacked architecture that can reason about self-consistency. We demonstrate our technique on 3D fluorescence microscopy data of Drosophila embryos, for which we are able to produce extremely accurate semantic segmentations in a matter of minutes, and on which other algorithms fail due to the size and high dimensionality of the data or the difficulty of the task.
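The efficiency claim can be made concrete with a generic sketch: once per-voxel features are computed, exact linear classification of every voxel reduces to a single tensor contraction. Shapes and weights below are illustrative only, not the paper's pyramid context features.

```python
import numpy as np

# Feature volume: (D, H, W, F) per-voxel descriptors (standing in for the
# paper's pyramid context features); all shapes here are illustrative.
D, H, W, F, C = 16, 64, 64, 32, 3
features = np.random.rand(D, H, W, F).astype(np.float32)
weights = np.random.rand(F, C).astype(np.float32)   # linear classifier

# One tensor contraction classifies every voxel at once, which is what
# makes exact per-voxel linear classification so cheap.
scores = np.tensordot(features, weights, axes=([3], [0]))  # (D, H, W, C)
labels = scores.argmax(axis=-1)                            # per-voxel class
```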
Project description: A growing body of evidence suggests that gene flow between closely related species is a widespread phenomenon. Alleles that introgress from one species into a close relative are typically neutral or deleterious, but sometimes confer a significant fitness advantage. Given the potential relevance to speciation and adaptation, numerous methods have been devised to identify regions of the genome that have experienced introgression. Recently, supervised machine learning approaches have been shown to be highly effective for detecting introgression. One especially promising approach is to treat population genetic inference as an image classification problem and feed an image representation of a population genetic alignment as input to a deep neural network that distinguishes among evolutionary models (i.e., introgression or no introgression). However, if we wish to investigate the full extent and fitness effects of introgression, merely identifying genomic regions in a population genetic alignment that harbor introgressed loci is insufficient; ideally, we would be able to infer precisely which individuals have introgressed material and at which positions in the genome. Here we adapt a deep learning algorithm for semantic segmentation, the task of correctly identifying the type of object to which each pixel in an image belongs, to the task of identifying introgressed alleles. Our trained neural network is thus able to infer, for each individual in a two-population alignment, which of that individual's alleles were introgressed from the other population. We use simulated data to show that this approach is highly accurate and that it can readily be extended to identify alleles introgressed from an unsampled "ghost" population, performing comparably to a supervised learning method tailored specifically to that task. Finally, we apply this method to data from Drosophila, showing that it accurately recovers introgressed haplotypes from real data. This analysis reveals that introgressed alleles are typically confined to lower frequencies within genic regions, suggestive of purifying selection, but are found at much higher frequencies in a region previously shown to be affected by adaptive introgression. Our method's success in recovering introgressed haplotypes in challenging real-world scenarios underscores the utility of deep learning approaches for making richer evolutionary inferences from genomic data.
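A schematic sketch of the framing described above, with illustrative sizes and a toy fully convolutional network in place of the paper's architecture: the alignment is treated as a one-channel image, and the network emits a per-individual, per-site introgression call.

```python
import numpy as np
import torch
import torch.nn as nn

# A two-population alignment as a 1-channel "image": rows are haplotypes,
# columns are polymorphic sites, entries are 0/1 alleles. Sizes are
# illustrative, not the paper's configuration.
n_haplotypes, n_sites = 64, 128
alignment = np.random.randint(0, 2, (n_haplotypes, n_sites)).astype(np.float32)
x = torch.from_numpy(alignment)[None, None]      # (batch, channel, H, W)

# A toy fully convolutional net standing in for the paper's segmentation
# architecture: it emits one introgressed/not-introgressed score per allele.
net = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
per_allele_logits = net(x)                       # (1, 1, 64, 128)
introgressed = per_allele_logits.sigmoid() > 0.5 # per-individual, per-site calls
```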