Project description:Engineered RNA elements are programmable tools capable of detecting small molecules, proteins, and nucleic acids. Predicting the behavior of these tools remains a challenge, one that enhanced pattern recognition from deep learning could address. Thus, we investigate deep neural networks (DNNs) for predicting the function of toehold switches, a canonical riboswitch model in synthetic biology. To facilitate DNN training, we synthesized and characterized in vivo a dataset of 91,534 toehold switches spanning 23 viral genomes and 906 human transcription factors. DNNs trained on nucleotide sequences outperformed (R² = 0.43-0.70) previous state-of-the-art thermodynamic and kinetic models (R² = 0.04-0.15) and allowed for human-understandable attention visualizations (VIS4Map) to identify success and failure modes. This deep learning approach constitutes a major step forward in the engineering and understanding of RNA synthetic biology.
Project description:Here we present miR-eCLIP analysis of AGO2 in HEK293 cells to address the small RNA repertoire and uncover their physiological targets. We developed an optimized bioinformatics approach to chimeric read identification that detects high-confidence chimeras, which were used as biologically validated input for miRBind, a deep learning method and web server that can be used to accurately predict the potential of miRNA:target site binding.
Project description:Enlargement or aneurysm of the aorta predisposes to dissection, an important cause of sudden death. We trained a deep learning model to evaluate the dimensions of the ascending and descending thoracic aorta in 4.6 million cardiac magnetic resonance images from the UK Biobank. We then conducted genome-wide association studies in 39,688 individuals, revealing 82 loci associated with ascending and 47 with descending thoracic aortic diameter, of which 14 loci overlapped. Transcriptome-wide analyses, rare-variant burden tests, and human aortic single-nucleus RNA sequencing prioritized genes including SVIL, which was strongly associated with descending aortic diameter. A polygenic score for ascending aortic diameter was associated with thoracic aortic aneurysm in 385,621 UK Biobank participants (HR = 1.43 per SD; CI 1.32-1.54; P = 3.3 × 10⁻²⁰). Our results illustrate the potential for rapidly defining quantitative traits with deep learning, an approach that can be broadly applied to biomedical images.
Project description:Cells are the singular building blocks of life, and a comprehensive understanding of morphology, among other properties, is crucial to assessing underlying heterogeneity. We developed Computational Sorting and Mapping of Single Cells (COSMOS), a platform based on Artificial Intelligence (AI) and microfluidics to characterize and sort single cells based on real-time deep learning interpretation of high-resolution brightfield images. Supervised deep learning models were applied to characterize and sort cell lines and dissociated primary tissue based on high-dimensional embedding vectors of morphology, without the need for biomarker labels or stains/dyes. We demonstrate COSMOS capabilities with multiple human cell lines and tissue samples. These early results suggest that our neural network embedding space can capture and recapitulate deep visual characteristics and can be used to efficiently purify unlabeled viable cells with desired morphological traits. Our approach resolves a technical gap in the ability to perform real-time deep learning assessment and sorting of cells based on high-resolution brightfield images.
Project description:We report DeepMS1, an approach that combines deep learning-based prediction of retention time, ion mobility, and detectability with linear regression-based scoring for MS1 feature identification.
Project description:A benchmark set of bottom-up proteomics data for training deep learning networks. It has data from 51 organisms and includes nearly 1 million peptides.