SPHIRE-crYOLO is a fast and accurate fully automated particle picker for cryo-EM.
ABSTRACT: Selecting particles from digital micrographs is an essential step in single-particle electron cryomicroscopy (cryo-EM). As manual selection of complete datasets, typically comprising thousands of particles, is a tedious and time-consuming process, numerous automatic particle pickers have been developed. However, non-ideal datasets pose a challenge to particle picking. Here we present the particle picking software crYOLO, which is based on the deep-learning object detection system You Only Look Once (YOLO). After training the network with 200-2500 particles per dataset, it automatically recognizes particles with high recall and precision while reaching a speed of up to five micrographs per second. Further, we present a general crYOLO network able to pick from previously unseen datasets, allowing for completely automated on-the-fly cryo-EM data preprocessing during data acquisition. crYOLO is available as a standalone program under http://sphire.mpg.de/ and is distributed as part of the image processing workflow in SPHIRE.
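The recall and precision mentioned above can be illustrated with a small sketch. It scores picked boxes against manually labelled boxes by intersection over union (IoU). The box layout, the greedy matching rule, and the `min_iou` value of 0.5 are assumptions made for illustration and are not taken from the crYOLO abstract.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, width, height), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    overlap_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    overlap_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = overlap_w * overlap_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def recall_precision(picked: List[Box], truth: List[Box],
                     min_iou: float = 0.5) -> Tuple[float, float]:
    """Greedily match each picked box to its best unmatched ground-truth box."""
    unmatched = list(truth)
    true_positives = 0
    for p in picked:
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= min_iou:
            unmatched.remove(best)  # each ground-truth box is matched at most once
            true_positives += 1
    recall = true_positives / len(truth) if truth else 0.0
    precision = true_positives / len(picked) if picked else 0.0
    return recall, precision
```

Recall is the fraction of labelled particles that were found; precision is the fraction of picks that correspond to a labelled particle.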
Project description:Particle picking is a crucial first step in the computational pipeline of single-particle cryo-electron microscopy (cryo-EM). Selecting particles from the micrographs is difficult, especially for small particles with low contrast. As high-resolution reconstruction typically requires hundreds of thousands of particles, manually picking that many particles is often too time-consuming. While template-based particle picking is currently a popular approach, it may suffer from introducing manual bias into the selection process. In addition, this approach is still somewhat time-consuming. This paper presents the APPLE (Automatic Particle Picking with Low user Effort) picker, a simple and novel approach for fast, accurate, and template-free particle picking. This approach is evaluated on publicly available datasets containing micrographs of β-galactosidase, T20S proteasome, 70S ribosome and keyhole limpet hemocyanin projections.
Project description:Single-particle cryo-electron microscopy (cryo-EM) has become a mainstream tool for the structural determination of biological macromolecular complexes. However, high-resolution cryo-EM reconstruction often requires hundreds of thousands of single-particle images. Particle extraction from experimental micrographs thus can be laborious and presents a major practical bottleneck in cryo-EM structural determination. Existing computational methods for particle picking often use low-resolution templates for particle matching, making them susceptible to reference-dependent bias. It is critical to develop a highly efficient template-free method for the automatic recognition of particle images from cryo-EM micrographs. We developed a deep learning-based algorithmic framework, DeepEM, for single-particle recognition from noisy cryo-EM micrographs, enabling automated particle picking, selection and verification in an integrated fashion. The kernel of DeepEM is built upon a convolutional neural network (CNN) composed of eight layers, which can be recursively trained to be highly "knowledgeable". Our approach exhibits an improved performance and accuracy when tested on the standard KLH dataset. Application of DeepEM to several challenging experimental cryo-EM datasets demonstrated its ability to avoid the selection of unwanted particles and non-particles even when true particles contain fewer features. The DeepEM methodology, derived from a deep CNN, allows automated particle extraction from raw cryo-EM micrographs in the absence of a template. It demonstrates an improved performance, objectivity and accuracy. Application of this novel method is expected to free the labor involved in single-particle verification, significantly improving the efficiency of cryo-EM data processing.
Project description:Single-particle cryo-electron microscopy (cryo-EM) has recently become a mainstream technique for the structural determination of macromolecules. Typical cryo-EM workflows collect hundreds of thousands of single-particle projections from thousands of micrographs using particle-picking algorithms. However, the number of false positives selected by these algorithms is large, so that a number of different 'cleaning steps' are necessary to decrease the false-positive ratio. Most commonly employed techniques for the pruning of false-positive particles are time-consuming and require user intervention. In order to overcome these limitations, a deep learning-based algorithm named Deep Consensus is presented in this work. Deep Consensus works by computing a smart consensus over the output of different particle-picking algorithms, resulting in a set of particles with a lower false-positive ratio than the initial set obtained by the pickers. Deep Consensus is based on a deep convolutional neural network that is trained on a semi-automatically generated data set. The performance of Deep Consensus has been assessed on two well-known experimental data sets. It virtually eliminates user intervention for pruning and enhances the reproducibility and objectivity of the whole process while achieving precision and recall figures above 90%.
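To make the idea of a consensus over several pickers concrete, here is a minimal sketch of a plain coordinate-voting baseline. It is not the Deep Consensus network, which the abstract describes as a trained CNN; the function name, the `radius` tolerance, and the `min_votes` rule are hypothetical choices for illustration only.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def vote_consensus(picks_per_picker: List[List[Point]],
                   radius: float, min_votes: int) -> List[Point]:
    """Keep a coordinate if at least `min_votes` pickers have a pick within `radius`."""
    consensus: List[Point] = []
    for picks in picks_per_picker:
        for x, y in picks:
            # Skip picks that duplicate an already accepted consensus particle.
            if any(math.hypot(x - cx, y - cy) <= radius for cx, cy in consensus):
                continue
            # Count pickers (including the current one) that agree on this location.
            votes = sum(
                any(math.hypot(x - ox, y - oy) <= radius for ox, oy in other)
                for other in picks_per_picker
            )
            if votes >= min_votes:
                consensus.append((x, y))
    return consensus
```

A simple vote like this treats all pickers equally; a learned consensus can instead weigh the image content of each candidate, which is the role the abstract assigns to the neural network.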
Project description:BACKGROUND:An important task of macromolecular structure determination by cryo-electron microscopy (cryo-EM) is the identification of single particles in micrographs (particle picking). Due to the necessity of human involvement in the process, current particle picking techniques are time consuming and often result in many false positives and negatives. Adjusting the parameters to eliminate false positives often excludes true particles in certain orientations. The supervised machine learning (e.g. deep learning) methods for particle picking often need a large training dataset, which requires extensive manual annotation. Other reference-dependent methods rely on low-resolution templates for particle detection, matching and picking, and therefore, are not fully automated. These issues motivate us to develop a fully automated, unbiased framework for particle picking. RESULTS:We design a fully automated, unsupervised approach for single particle picking in cryo-EM micrographs. Our approach consists of three stages: image preprocessing, particle clustering, and particle picking. The image preprocessing is based on multiple techniques including: image averaging, normalization, cryo-EM image contrast enhancement correction (CEC), histogram equalization, restoration, adaptive histogram equalization, guided image filtering, and morphological operations. Image preprocessing significantly improves the quality of original cryo-EM images. Our particle clustering method is based on an intensity distribution model which is much faster and more accurate than traditional K-means and Fuzzy C-Means (FCM) algorithms for single particle clustering. Our particle picking method, based on image cleaning and shape detection with a modified Circular Hough Transform algorithm, effectively detects the shape and the center of each particle and creates a bounding box encapsulating the particles. 
CONCLUSIONS:AutoCryoPicker can automatically and effectively recognize particle-like objects from noisy cryo-EM micrographs without the need of labeled training data or human intervention making it a useful tool for cryo-EM protein structure determination.
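One of the preprocessing techniques listed above, histogram equalization, can be sketched in a few lines. This is the textbook global form of the method for integer-valued images, written as an illustration; it is not AutoCryoPicker's code, and the 256-level default and nested-list image format are assumptions.

```python
from typing import List


def equalize_histogram(image: List[List[int]], levels: int = 256) -> List[List[int]]:
    """Global histogram equalization for an image with integer values in [0, levels)."""
    flat = [v for row in image for v in row]
    hist = [0] * levels
    for v in flat:
        hist[v] += 1

    # Cumulative distribution of intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(flat)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        return [row[:] for row in image]  # constant image: nothing to stretch

    # Map each intensity so that the output CDF is approximately linear.
    lut = [
        max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
        for c in cdf
    ]
    return [[lut[v] for v in row] for row in image]
```

The effect is to spread a narrow band of grey values over the full intensity range, which is why it is commonly used to raise the visible contrast of low-contrast images.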
Project description:Extraction of particles from cryo-electron microscopy (cryo-EM) micrographs is a crucial step in processing single-particle datasets. Although algorithms have been developed for automatic particle picking, these algorithms generally rely on two-dimensional templates for particle identification, which may exhibit biases that can propagate artifacts through the reconstruction pipeline. Manual picking is viewed as a gold-standard solution for particle selection, but it is too time-consuming to perform on data sets of thousands of images. In recent years, crowdsourcing has proven effective at leveraging the open web to manually curate datasets. In particular, citizen science projects such as Galaxy Zoo have shown the power of appealing to users' scientific interests to process enormous amounts of data. To this end, we explored the possible applications of crowdsourcing in cryo-EM particle picking, presenting a variety of novel experiments including the production of a fully annotated particle set from untrained citizen scientists. We show the possibilities and limitations of crowdsourcing particle selection tasks, and explore further options for crowdsourcing cryo-EM data processing.
Project description:BACKGROUND:The detection of weak signals and selection of single particles from low-contrast micrographs of frozen hydrated biomolecules by cryo-electron microscopy (cryo-EM) represents a major practical bottleneck in cryo-EM data analysis. Template-based particle picking by an objective function using fast local correlation (FLC) allows computational extraction of a large number of candidate particles from micrographs. Another independent objective function based on maximum likelihood estimates (MLE) can be used to align the images and verify the presence of a signal in the selected particles. Despite the widespread applications of the two objective functions, an optimal combination of their utilities has not been exploited. Here we propose a bi-objective function (BOF) approach that combines both FLC and MLE and explore the potential advantages and limitations of BOF in signal detection from cryo-EM data. RESULTS:The robustness of the BOF strategy in particle selection and verification was systematically examined with both simulated and experimental cryo-EM data. We investigated how the performance of the BOF approach is quantitatively affected by the signal-to-noise ratio (SNR) of cryo-EM data and by the choice of initialization for FLC and MLE. We quantitatively pinpointed the critical SNR (~ 0.005), at which the BOF approach starts losing its ability to select and verify particles reliably. We found that the use of a Gaussian model to initialize the MLE suppresses the adverse effects of reference dependency in the FLC function used for template-matching. CONCLUSION:The BOF approach, which combines two distinct objective functions, provides a sensitive way to verify particles for downstream cryo-EM structure analysis. Importantly, reference dependency of the FLC does not necessarily transfer to the MLE, enabling the robust detection of weak signals. 
Our insights into the numerical behavior of the BOF approach can be used to improve automation efficiency in the cryo-EM data processing pipeline for high-resolution structural determination.
Project description:High-resolution single-particle cryo-EM data analysis relies on accurate particle picking. To facilitate the particle picking process, a self-supervised workflow has been developed. This includes an iterative strategy, which uses a 2D class average to improve training particles, and a progressively improved convolutional neural network for particle picking. To automate the selection of particles, a threshold is defined (%/Res) using the ratio of percentage class distribution and resolution as a cutoff. This workflow has been tested using six publicly available data sets with different particle sizes and shapes, and can automatically pick particles with minimal user input. The picked particles support high-resolution reconstructions at 3.0 Å or better. This workflow is a step towards automated single-particle cryo-EM data analysis at the stage of particle picking. It may be used in conjunction with commonly used single-particle analysis packages such as Relion, cryoSPARC, cisTEM, SPHIRE and EMAN2.
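The %/Res threshold described above can be sketched as simple arithmetic on per-class statistics. The dictionary keys, the direction of the comparison (keeping classes whose ratio is at or above the cutoff), and the example cutoff are assumptions; the abstract only states that the ratio of percentage class distribution to resolution is used as a cutoff.

```python
from typing import Dict, List


def select_classes(classes: List[Dict[str, float]], cutoff: float) -> List[int]:
    """Return ids of 2D classes whose %/Res ratio reaches the cutoff.

    Each class is a dict with hypothetical keys:
      'id'           class number
      'percent'      share of particles assigned to the class, in percent
      'resolution_A' estimated class resolution in angstroms (smaller is better)
    """
    selected = []
    for c in classes:
        if c["resolution_A"] <= 0:
            continue  # skip classes without a valid resolution estimate
        ratio = c["percent"] / c["resolution_A"]
        if ratio >= cutoff:
            selected.append(int(c["id"]))
    return selected
```

Under this reading, a well-populated class with a good resolution estimate (for example 12% at 6 Å, ratio 2.0) scores high, while a sparse, poorly resolved class (1.5% at 15 Å, ratio 0.1) scores low.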
Project description:BACKGROUND:Cryo-electron microscopy (cryo-EM) has become a widely used tool for determining the structures of proteins and macromolecular complexes. To acquire the input for single-particle cryo-EM reconstruction, researchers must select hundreds of thousands of particles from micrographs. As the signal-to-noise ratio (SNR) of micrographs is extremely low, the performance of automated particle-selection methods is still unable to meet research requirements. To free researchers from this laborious work and to acquire a large number of high-quality particles, we propose an automated particle-selection method (PIXER) based on the idea of segmentation using a deep neural network. RESULTS:First, to accommodate low-SNR conditions, we convert micrographs into probability density maps using a segmentation network. These probability density maps indicate the likelihood that each pixel of a micrograph is part of a particle instead of just background noise. Particles selected from density maps have a more robust signal than do those directly selected from the original noisy micrographs. Second, at present, there is no segmentation-training dataset for cryo-EM. To enable our plan, we present an automated method to generate a training dataset for segmentation using real-world data. Third, we propose a grid-based, local-maximum method to locate the particles from the probability density maps. We tested our method on simulated and real-world experimental datasets and compared PIXER with the mainstream methods RELION, DeepEM and DeepPicker to demonstrate its performance. The results indicate that, as a fully automated method, PIXER can acquire results as good as the semi-automated methods RELION and DeepEM. CONCLUSION:To our knowledge, our work is the first to address the particle-selection problem using the segmentation network concept. As a fully automated particle-selection method, PIXER can free researchers from laborious particle-selection work. 
Based on the results of experiments, PIXER can acquire accurate results under low-SNR conditions within minutes.
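The third step, locating particles from a probability density map with a grid-based local-maximum search, can be sketched as follows. This is one plausible reading of that phrase and not PIXER's implementation: the cell size, the probability threshold, and the distance-based suppression of nearby peaks are assumptions, and the map is assumed to be a non-empty rectangular nested list.

```python
import math
from typing import List, Tuple


def grid_local_maxima(prob: List[List[float]], cell: int,
                      threshold: float) -> List[Tuple[int, int]]:
    """Pick at most one peak per grid cell, then drop peaks that crowd a stronger one."""
    rows, cols = len(prob), len(prob[0])
    candidates = []
    for r0 in range(0, rows, cell):
        for c0 in range(0, cols, cell):
            # Highest-probability pixel inside this grid cell.
            best = max(
                (prob[r][c], r, c)
                for r in range(r0, min(r0 + cell, rows))
                for c in range(c0, min(c0 + cell, cols))
            )
            if best[0] >= threshold:
                candidates.append(best)

    # Strongest candidates first; reject any peak closer than one cell to a kept peak.
    candidates.sort(reverse=True)
    kept: List[Tuple[int, int]] = []
    for _, r, c in candidates:
        if all(math.hypot(r - kr, c - kc) >= cell for kr, kc in kept):
            kept.append((r, c))
    return kept
```

Choosing `cell` close to the expected particle diameter keeps one pick per particle; the suppression step handles particles that straddle a cell boundary and would otherwise be reported twice.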
Project description:Single particle cryo-EM requires full automation to allow high-throughput structure determination. Although software packages exist where parts of the cryo-EM pipeline are automated, a complete solution that offers reliable on-the-fly processing, resulting in high-resolution structures, does not exist. Here we present TranSPHIRE: A software package for fully-automated processing of cryo-EM datasets during data acquisition. TranSPHIRE transfers data from the microscope, automatically applies the common pre-processing steps, picks particles, performs 2D clustering, and 3D refinement parallel to image recording. Importantly, TranSPHIRE introduces a machine learning-based feedback loop to re-train its picking model to adapt to any given data set live during processing. This elegant approach enables TranSPHIRE to process data more effectively, producing high-quality particle stacks. TranSPHIRE collects and displays all metrics and microscope settings to allow users to quickly evaluate data during acquisition. TranSPHIRE can run on a single work station and also includes the automated processing of filaments.
Project description:The acquisition of cryo-electron microscopy (cryo-EM) data from biological specimens must be tightly coupled to data preprocessing to ensure the best data quality and microscope usage. Here we describe Warp, a software that automates all preprocessing steps of cryo-EM data acquisition and enables real-time evaluation. Warp corrects micrographs for global and local motion, estimates the local defocus and monitors key parameters for each recorded micrograph or tomographic tilt series in real time. The software further includes deep-learning-based models for accurate particle picking and image denoising. The output from Warp can be fed into established programs for particle classification and 3D-map refinement. Our benchmarks show improvement in the nominal resolution, which went from 3.9 Å to 3.2 Å, of a published cryo-EM data set for influenza virus hemagglutinin. Warp is easy to install from http://github.com/cramerlab/warp and computationally inexpensive, and has an intuitive, streamlined user interface.