Project description:Alternative splicing is widely acknowledged to be a crucial regulator of gene expression and is a key contributor to both normal developmental processes and disease states. While cost-effective and accurate for quantification, short-read RNA-seq cannot resolve full-length transcript isoforms despite increasingly sophisticated computational methods. Long-read sequencing platforms such as Pacific Biosciences (PacBio) and Oxford Nanopore (ONT) bypass the transcript reconstruction challenges of short reads. Here we describe TALON, the ENCODE4 pipeline for analyzing PacBio cDNA and ONT direct-RNA transcriptomes. We apply TALON to three human ENCODE Tier 1 cell lines and show that while both technologies perform well at full-transcript discovery and quantification, each technology has its own distinct artifacts. We further apply TALON to mouse cortical and hippocampal transcriptomes and find that a substantial proportion of neuronal genes have more reads associated with novel isoforms than with annotated ones. The TALON pipeline for technology-agnostic, long-read transcriptome discovery and quantification tracks both known and novel transcript models, as well as expression levels, across datasets. It supports both simple studies and larger projects such as ENCODE that seek to decode transcriptional regulation in the human and mouse genomes and to predict expression levels of genes and transcripts more accurately than is possible with short reads alone.
Project description:Accompanying benchmarking sample for "TaxIt: An iterative computational pipeline for untargeted strain-level identification using MS/MS spectra from pathogenic single-organism samples": Untargeted, accurate strain-level classification of a priori unidentified organisms using tandem mass spectrometry is a challenging task. Reference databases often lack taxonomic depth, limiting peptide assignments to the species level. However, extending them with detailed strain information increases runtime and decreases statistical power. In addition, larger databases contain a higher number of similar proteomes. We present TaxIt, an iterative workflow to address the increasing search space required for MS/MS-based strain-level classification of samples with unknown taxonomic origin. TaxIt first applies reference sequence data for initial identification of species candidates, followed by automated acquisition of relevant strain sequences for low-level classification. Furthermore, proteome similarities resulting in ambiguous taxonomic assignments are addressed with an abundance weighting strategy to increase the confidence in candidate taxa. To benchmark the performance of our method, we apply our iterative workflow to several samples of bacterial and viral origin. In comparison to non-iterative approaches using unique peptides or advanced abundance correction, TaxIt identifies microbial strains correctly in all examples presented (with one tie), thereby demonstrating the potential for untargeted and deeper taxonomic classification. TaxIt makes extensive use of public, unrestricted and continuously growing sequence resources such as the NCBI databases and is available under an open-source BSD license at https://gitlab.com/rki_bioinformatics/TaxIt.
Project description:Cell states are regulated by extrinsic signals from external factors such as intercellular interactions, and by intrinsic gene expression. Although comprehensive cell state profiling has been attempted, simultaneous analysis of signal activation remains challenging. Multiplexed imaging is a technique that acquires information on multiple proteins at the single-cell level, as in traditional immunofluorescence. However, the method often compromises resolution, hindering the analysis of intracellular localization dynamics and post-translational modifications of proteins. To address these limitations, we developed an erasable fluorescence method that uses disulfide linkers to label antibodies. We term these antibodies ‘Precise Emission Canceling Antibodies (PECAbs)’. PECAbs allow high-resolution iterative imaging with minimal non-specific binding. Automation enables our system to achieve reproducible quantitative analysis using 206 antibodies. The resulting quantitative data allow reconstruction of the spatiotemporal dynamics of signaling pathways over both long and short timescales. Additionally, combining this approach with sequential RNA-FISH can effectively classify cells and identify their signal activation states in human tissue. Overall, the PECAb system serves as a comprehensive platform for analyzing complex cell processes, from signal transduction to gene expression.