Mutual Information-Based Disentangled Neural Networks for Classifying Unseen Categories in Different Domains: Application to Fetal Ultrasound Imaging.
ABSTRACT: Deep neural networks exhibit limited generalizability across images with different entangled domain features and categorical features. Learning generalizable features that can form universal categorical decision boundaries across domains is an interesting and difficult challenge. This problem occurs frequently in medical imaging when deep learning models are deployed or improved across different image acquisition devices or acquisition parameters, or when some classes are unavailable in new training databases. To address this problem, we propose Mutual Information-based Disentangled Neural Networks (MIDNet), which extract generalizable categorical features to transfer knowledge to unseen categories in a target domain. MIDNet adopts a semi-supervised learning paradigm to alleviate the dependency on labeled data, which is important for real-world applications where data annotation is time-consuming, costly, and requires training and expertise. We extensively evaluate the proposed method on fetal ultrasound datasets for two image classification tasks in which the domain features are defined by shadow artifacts and by image acquisition devices, respectively. Experimental results show that the proposed method outperforms the state of the art on the classification of unseen categories in a target domain with sparsely labeled training data.
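The disentanglement criterion that MIDNet's name refers to can be illustrated on discrete toy variables: if the mutual information between a domain code and a category code is zero, the two codes are statistically independent, i.e., disentangled. A minimal NumPy sketch with invented toy distributions (not the paper's network or estimator):

```python
import numpy as np

def mutual_information(joint):
    """Mutual information I(X;Y) in nats from a joint probability table."""
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1, keepdims=True)   # marginal of X (rows)
    py = joint.sum(axis=0, keepdims=True)   # marginal of Y (columns)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px * py)[nz])))

# Independent (disentangled) domain/category codes: MI is 0.
independent = np.outer([0.5, 0.5], [0.25, 0.25, 0.25, 0.25])
# Fully entangled codes: knowing the domain determines the category.
entangled = np.array([[0.5, 0.0], [0.0, 0.5]])

print(round(mutual_information(independent), 6))  # 0.0
print(round(mutual_information(entangled), 6))    # 0.693147 (= log 2)
```

In the neural setting the mutual information between continuous feature vectors is not available in closed form and has to be estimated and minimized during training; the table-based computation above only shows what a value of zero means.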
Project description:Radiomics has shown promising results in several medical studies, yet it suffers from limited discriminative and informative capability, as well as high variability and correlation with tomographic scanner type, pixel spacing, acquisition protocol, and reconstruction parameters. We propose and compare two methods to transform quantitative image features so as to improve their stability across varying image acquisition parameters while preserving their texture discrimination abilities. In this way, variations in the extracted features are representative of true physiopathological tissue changes in the scanned patients. The first approach is based on a two-layer neural network that can learn a nonlinear standardization transformation of various types of features, including handcrafted and deep features. In the second, domain adversarial training is explored to increase the invariance of the transformed features to the scanner of origin. The generalization of the proposed approaches to unseen textures and unseen scanners is demonstrated by a set of experiments using a publicly available computed tomography texture phantom dataset scanned with various imaging devices and parameters.
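The goal of such a standardization, removing scanner-dependent offsets while keeping texture separation, can be illustrated with the much simpler per-scanner z-scoring; the paper learns a nonlinear two-layer transform instead, and all names and numbers below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy data: one radiomic feature, two textures, two scanners.
# Scanner B adds a systematic offset that masks the texture difference.
texture_effect = {"fine": 0.0, "coarse": 2.0}
scanner_offset = {"A": 0.0, "B": 5.0}

rows = []
for scanner, off in scanner_offset.items():
    for texture, eff in texture_effect.items():
        vals = rng.normal(eff + off, 0.1, size=50)
        rows += [(scanner, texture, v) for v in vals]

def per_scanner_zscore(rows):
    """Standardize the feature separately within each scanner."""
    out = []
    for s in {r[0] for r in rows}:
        vals = np.array([r[2] for r in rows if r[0] == s])
        mu, sd = vals.mean(), vals.std()
        out += [(r[0], r[1], (r[2] - mu) / sd) for r in rows if r[0] == s]
    return out

std_rows = per_scanner_zscore(rows)
# After standardization the per-scanner means coincide...
mean_A = np.mean([v for s, t, v in std_rows if s == "A"])
mean_B = np.mean([v for s, t, v in std_rows if s == "B"])
# ...while the texture separation survives within every scanner.
gap = np.mean([v for s, t, v in std_rows if t == "coarse"]) - \
      np.mean([v for s, t, v in std_rows if t == "fine"])
print(abs(mean_A - mean_B) < 1e-9, gap > 1.0)  # True True
```

A learned transform generalizes this idea to nonlinear, feature-dependent corrections and, with adversarial training, to cases where the scanner of origin must become unpredictable from the features.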
Project description:BACKGROUND:Cell counting from cell cultures is required in multiple biological and biomedical research applications. In particular, accurate brightfield-based cell counting methods are needed for cell growth analysis. With deep learning, cells can be detected with high accuracy, but manually annotated training data are required. We propose a method for cell detection that requires annotated training data for one cell line only and generalizes to other, unseen cell lines. RESULTS:Training a deep learning model with one cell line only can provide accurate detections for similar unseen cell lines (domains). However, if the new domain is very dissimilar from the training domain, high precision but lower recall is achieved. The generalization capabilities of the model can be improved with training data transformations, but only to a certain degree. To further improve the detection accuracy on unseen domains, we propose an iterative unsupervised domain adaptation method. Predictions on unseen cell lines with high precision enable automatic generation of training data, which is used to train the model together with parts of the previously used annotated training data. We used a U-Net-based model and three consecutive focal planes from brightfield image z-stacks. We trained the model initially with the PC-3 cell line and used the LNCaP, BT-474, and 22Rv1 cell lines as target domains for domain adaptation. The highest improvement in accuracy was achieved for 22Rv1 cells: the F1-score after supervised training was only 0.65, but after unsupervised domain adaptation we achieved a score of 0.84. The mean accuracy for the target domains was 0.87, with a mean improvement of 16 percent. CONCLUSIONS:With our method for generalized cell detection, we can train a model that accurately detects different cell lines from brightfield images.
A new cell line can be introduced to the model without a single manual annotation, and after iterative domain adaptation the model is ready to detect these cells with high accuracy.
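The iterative scheme described above (predict on the new domain, keep only confident predictions as pseudo-labels, retrain together with the original annotations) can be sketched with a deliberately tiny 1-D threshold classifier; the Gaussian clusters, margin, and iteration count are illustrative stand-ins, not the authors' U-Net pipeline:

```python
import numpy as np

rng = np.random.default_rng(42)

# Labeled source domain (e.g. the annotated cell line).
src0 = rng.normal(0.0, 0.2, 100)   # class 0
src1 = rng.normal(2.0, 0.2, 100)   # class 1
# Unlabeled target domain (a new cell line), shifted by +1.
tgt = np.concatenate([rng.normal(1.0, 0.2, 100), rng.normal(3.0, 0.2, 100)])
tgt_truth = np.array([0] * 100 + [1] * 100)

def accuracy(thr):
    return float(np.mean((tgt > thr).astype(int) == tgt_truth))

# Initial decision threshold fitted on source data only.
thr = (src0.mean() + src1.mean()) / 2
acc_init = accuracy(thr)

# Iterative adaptation: confident target predictions become pseudo-labels
# and are pooled with the source annotations to refit the threshold.
margin = 0.5
for _ in range(5):
    pseudo0 = tgt[tgt < thr - margin]
    pseudo1 = tgt[tgt > thr + margin]
    thr = (np.concatenate([src0, pseudo0]).mean()
           + np.concatenate([src1, pseudo1]).mean()) / 2

acc_final = accuracy(thr)
print(acc_init, acc_final)  # adaptation moves the threshold toward the target
```

The same logic, "precision first, recall recovered by retraining on confident pseudo-labels", is what allows the real detector to absorb a new cell line without any manual annotation.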
Project description:While learning is often highly specific to the exact stimuli and tasks used during training, there are cases where training results in learning that generalizes more broadly. It has previously been argued that the degree of specificity can be predicted from the learning solution(s) dictated by the particular demands of the training task. Here we applied this logic in the domain of rule-based categorization learning. Participants were presented with stimuli corresponding to four different categories and were asked to perform either a category discrimination task (which permits learning a specific rule to discriminate two categories) or a category identification task (which does not permit learning a specific discrimination rule). In a subsequent transfer stage, all participants were asked to discriminate stimuli belonging to two of the categories which they had seen but had never directly discriminated before (i.e., this particular discrimination was omitted from training). As predicted, learning in the category-discrimination task tended to be specific, while the category-identification task produced learning that transferred to the transfer discrimination task. These results suggest that the discrimination and identification tasks fostered the acquisition of different category representations that were more or less generalizable.
Project description:PURPOSE:Although deep learning has shown great promise for MR image reconstruction, an open question regarding the success of this approach is the robustness in the case of deviations between training and test data. The goal of this study is to assess the influence of image contrast, SNR, and image content on the generalization of learned image reconstruction, and to demonstrate the potential for transfer learning. METHODS:Reconstructions were trained from undersampled data using data sets with varying SNR, sampling pattern, image contrast, and synthetic data generated from a public image database. The performance of the trained reconstructions was evaluated on 10 in vivo patient knee MRI acquisitions from 2 different pulse sequences that were not used during training. Transfer learning was evaluated by fine-tuning baseline trainings from synthetic data with a small subset of in vivo MR training data. RESULTS:Deviations in SNR between training and testing led to substantial decreases in reconstruction image quality, whereas image contrast was less relevant. Trainings from heterogeneous training data generalized well toward the test data with a range of acquisition parameters. Trainings from synthetic, non-MR image data showed residual aliasing artifacts, which could be removed by transfer learning-inspired fine-tuning. CONCLUSION:This study presents insights into the generalization ability of learned image reconstruction with respect to deviations in the acquisition settings between training and testing. It also provides an outlook for the potential of transfer learning to fine-tune trainings to a particular target application using only a small number of training cases.
Project description:Plant leaf stomata are the gatekeepers of the atmosphere-plant interface and are essential building blocks of land surface models as they control transpiration and photosynthesis. Although more stomatal trait data are needed to significantly reduce the error in these model predictions, recording these traits is time-consuming, and no standardized protocol is currently available. Some attempts were made to automate stomatal detection from photomicrographs; however, these approaches have the disadvantage of using classic image processing or targeting a narrow taxonomic entity, which makes these technologies less robust and generalizable to other plant species. We propose an easy-to-use and adaptable workflow from leaf to label. A methodology for automatic stomata detection was developed using deep neural networks according to the state of the art and its applicability demonstrated across the phylogeny of the angiosperms. We used a patch-based approach for training/tuning three different deep learning architectures. For training, we used 431 micrographs taken from leaf prints made according to the nail polish method from herbarium specimens of 19 species. The best-performing architecture was tested on 595 images of 16 additional species spread across the angiosperm phylogeny. The nail polish method was successfully applied in 78% of the species sampled here. The VGG19 architecture slightly outperformed the basic shallow and deep architectures, with a confidence threshold equal to 0.7 resulting in an optimal trade-off between precision and recall. Applying this threshold, the VGG19 architecture obtained an average F-score of 0.87, 0.89, and 0.67 on the training, validation, and unseen test set, respectively.
The average accuracy was very high (94%) for computed stomatal counts on unseen images of species used for training. The leaf-to-label pipeline is an easy-to-use workflow for researchers of different areas of expertise interested in detecting stomata more efficiently. The described methodology was based on multiple species and well-established methods so that it can serve as a reference for future work.
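Choosing a confidence threshold such as the 0.7 reported above is a precision/recall trade-off that can be made explicit by sweeping thresholds over detection scores; the scores and labels below are invented for illustration, not the study's detections:

```python
def precision_recall_f1(scores, labels, threshold):
    """Treat a detection as positive when its confidence reaches the threshold."""
    tp = sum(s >= threshold and y == 1 for s, y in zip(scores, labels))
    fp = sum(s >= threshold and y == 0 for s, y in zip(scores, labels))
    fn = sum(s < threshold and y == 1 for s, y in zip(scores, labels))
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Toy detections: confidence score, ground-truth stoma (1) or clutter (0).
scores = [0.95, 0.9, 0.85, 0.8, 0.75, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1,    1,   1,    0,   1,    0,   1,    0,   0,   1]

for thr in (0.3, 0.5, 0.7, 0.9):
    p, r, f1 = precision_recall_f1(scores, labels, thr)
    print(f"thr={thr:.1f}  precision={p:.2f}  recall={r:.2f}  F1={f1:.2f}")
```

Raising the threshold trades recall for precision; the optimal operating point is the threshold that maximizes F1 (or whichever error trade-off the application requires).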
Project description:Selecting particles from digital micrographs is an essential step in single-particle electron cryomicroscopy (cryo-EM). As manual selection of complete datasets (typically comprising thousands of particles) is a tedious and time-consuming process, numerous automatic particle pickers have been developed. However, non-ideal datasets pose a challenge to particle picking. Here we present the particle picking software crYOLO, which is based on the deep-learning object detection system You Only Look Once (YOLO). After training the network with 200-2500 particles per dataset, it automatically recognizes particles with high recall and precision while reaching a speed of up to five micrographs per second. Further, we present a general crYOLO network able to pick from previously unseen datasets, allowing for completely automated on-the-fly cryo-EM data preprocessing during data acquisition. crYOLO is available as a standalone program under http://sphire.mpg.de/ and is distributed as part of the image processing workflow in SPHIRE.
Project description:Machine learning is a promising approach to evaluate human movement based on wearable sensor data. A representative dataset for training data-driven models is crucial to ensure that the model generalizes well to unseen data. However, the acquisition of sufficient data is time-consuming and often infeasible. We present a method to create realistic inertial sensor data with corresponding biomechanical variables by 2D walking and running simulations. We augmented a measured inertial sensor dataset with simulated data for the training of convolutional neural networks to estimate sagittal plane joint angles, joint moments, and ground reaction forces (GRFs) of walking and running. When adding simulated data, the root mean square error (RMSE) on the test set decreased by up to 17%, 27%, and 23% for the hip, knee, and ankle joint angles, by up to 6% for the knee and ankle joint moments, and by up to 2% and 6% for the anterior-posterior and vertical GRFs. Simulation-aided estimation of joint moments and GRFs was limited by inaccuracies of the biomechanical model. Improving the physics-based model and domain adaptation learning may further increase the benefit of simulated data. Future work can exploit biomechanical simulations to connect different data sources in order to create representative datasets of human movement. In conclusion, machine learning can benefit from available domain knowledge on biomechanical simulations to supplement cumbersome data collections.
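The reported percentage reductions are relative RMSE improvements between the measured-only and the simulation-augmented models; a quick illustrative helper (the arrays are toy numbers, not the study's data):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between ground truth and prediction."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def relative_improvement_pct(rmse_baseline, rmse_augmented):
    """Percentage by which augmentation reduced the baseline RMSE."""
    return 100.0 * (rmse_baseline - rmse_augmented) / rmse_baseline

angle = [10.0, 12.0, 14.0, 16.0]               # ground-truth joint angle (deg)
pred_measured_only = [12.0, 14.0, 16.0, 18.0]  # off by 2 deg everywhere
pred_with_sim = [11.0, 13.0, 15.0, 17.0]       # off by 1 deg everywhere

base = rmse(angle, pred_measured_only)   # 2.0
aug = rmse(angle, pred_with_sim)         # 1.0
print(relative_improvement_pct(base, aug))  # 50.0
```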
Project description:Due to significant advances in sensor technology, studies towards activity recognition have gained interest and maturity in the last few years. Existing machine learning algorithms have demonstrated promising results by classifying activities whose instances have already been seen during training. Activity recognition methods based on real-life settings should cover a growing number of activities in various domains, whereby a significant part of the instances will not be present in the training data set. However, covering all possible activities in advance is a complex and expensive task. Concretely, we need a method that can extend the learning model to detect unseen activities without prior knowledge of sensor readings about those previously unseen activities. In this paper, we introduce an approach that leverages sensor data to discover new, unseen activities which were not present in the training set. We show that sensor readings can lead to promising results for zero-shot learning, whereby the necessary knowledge can be transferred from seen to unseen activities by using semantic similarity. The evaluation conducted on two data sets extracted from the well-known CASAS datasets shows that the proposed zero-shot learning approach achieves high performance in recognizing unseen (i.e., not present in the training dataset) new activities.
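The transfer step in such zero-shot recognition rests on comparing a semantic description inferred from sensor readings against attribute vectors of the unseen classes; a hypothetical sketch with made-up binary attributes (not the paper's CASAS feature set or similarity measure):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two attribute vectors."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical semantic attributes: [kitchen, water_use, seated, nighttime]
unseen_classes = {
    "washing_dishes": [1, 1, 0, 0],
    "sleeping":       [0, 0, 0, 1],
    "reading":        [0, 0, 1, 0],
}

# Attribute vector predicted from the sensor readings of an unlabeled
# episode (in practice this predictor is learned on the seen activities).
predicted_attributes = [0.9, 0.8, 0.1, 0.0]

best = max(unseen_classes,
           key=lambda c: cosine(predicted_attributes, unseen_classes[c]))
print(best)  # washing_dishes
```

Because the attribute space is shared between seen and unseen activities, no sensor data for the unseen classes is ever needed, only their semantic descriptions.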
Project description:High-throughput microscopy of many single cells generates high-dimensional data that are far from straightforward to analyze. One important problem is automatically detecting the cellular compartment where a fluorescently tagged protein resides, a task relatively simple for an experienced human but difficult to automate on a computer. Here, we train an 11-layer neural network on data from mapping thousands of yeast proteins, achieving a per-cell localization classification accuracy of 91% and a per-protein accuracy of 99% on held-out images. We confirm that low-level network features correspond to basic image characteristics, while deeper layers separate localization classes. Using this network as a feature calculator, we train standard classifiers that assign proteins to previously unseen compartments after observing only a small number of training examples. Our results are the most accurate subcellular localization classifications to date and demonstrate the usefulness of deep learning for high-throughput microscopy.
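Using a trained network as a fixed feature calculator and fitting a lightweight classifier on a handful of examples can be sketched with a nearest-centroid rule; here random vectors stand in for the network's activations, and the compartment names and sample counts are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)

# Stand-in for deep features of two previously unseen compartments:
# a few labeled examples per class in the network's feature space.
dim = 16
centers = {"nucleolus": rng.normal(0, 1, dim),
           "vacuole":   rng.normal(0, 1, dim)}
train = {c: [centers[c] + rng.normal(0, 0.3, dim) for _ in range(5)]
         for c in centers}

# Nearest-centroid classifier fitted from only 5 examples per class.
centroids = {c: np.mean(v, axis=0) for c, v in train.items()}

def classify(feat):
    return min(centroids, key=lambda c: np.linalg.norm(feat - centroids[c]))

# A held-out example drawn near the "vacuole" center is recovered correctly.
query = centers["vacuole"] + rng.normal(0, 0.3, dim)
print(classify(query))
```

The key point is that the expensive part, learning a feature space in which compartments separate, is done once; extending to a new compartment then needs only a few labeled examples and any standard classifier.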
Project description:Computational histomorphometric approaches typically use low-level image features for building machine learning classifiers. However, these approaches usually ignore high-level expert knowledge. A computational model (M_im) combines low-, mid-, and high-level image information to predict the likelihood of cancer in whole slide images. Handcrafted low- and mid-level features are computed from area, color, and spatial nuclei distributions. High-level information is implicitly captured from the recorded navigations of pathologists while exploring whole slide images during diagnostic tasks. This model was validated by predicting the presence of cancer in a set of unseen fields of view. The available database was composed of 24 cases of basal-cell carcinoma, of which 17 served to estimate the model parameters and the remaining 7 comprised the evaluation set. A total of 274 fields of view of size [Formula: see text] were extracted from the evaluation set. Then 176 patches from this set were used to train a support vector machine classifier to predict the presence of cancer on a patch-by-patch basis, while the remaining 98 image patches were used for independent testing, ensuring that the training and test sets did not comprise patches from the same patient. A baseline model (M_ex) estimated the cancer likelihood for each of the image patches. M_ex uses the same visual features as M_im, but its weights are estimated from nuclei manually labeled as cancerous or noncancerous by a pathologist. M_im achieved an accuracy of 74.49% and an F-measure of 80.31%, while M_ex yielded a corresponding accuracy of 73.47% and an F-measure of 77.97%.