Project description:Investigation of whole-genome expression patterns in Danio rerio embryos at 60 and 72 hours post fertilization (hpf) after exposure to TMT or vehicle control. Embryos were exposed to 10 µM TMT or control from 48 hpf to 60 or 72 hpf. Three replicates were collected for each time point. Forty embryos were pooled to comprise a replicate.
Project description:Investigation of whole-genome expression patterns in Danio rerio embryos at 60 and 72 hours post fertilization after exposure to TMT or vehicle control.
Project description:In this study, we present DeepPhospho, a hybrid deep neural network that differs conceptually from all previous deep learning models for unmodified or modified peptide prediction in how it learns peptide representations. Our approach uses a multi-module network and a self-attention mechanism to learn a highly expressive peptide representation, yielding more accurate predictions. When evaluated with multiple phosphoproteomics datasets acquired by DIA or DDA methods, DeepPhospho surpasses existing benchmarks and tools in the prediction of fragmentation patterns for phosphopeptides. In certain cases, a large variance between a DeepPhospho-predicted MS/MS spectrum and an experimentally assigned spectrum revealed that the latter was a false identification, while the predicted spectrum closely mimicked the bona fide spectrum. Moreover, accurate prediction of chromatographic retention time for any phosphopeptide sequence is integrated into DeepPhospho, which allows for convenient construction of in silico spectral libraries to enhance DIA phosphoproteomics data mining.
Project description:In this study, a comprehensive spectral library for S. pneumoniae with an emphasis on phosphopeptide spectra was compiled. In total, 76% of the theoretical S. pneumoniae proteins are represented in the spectral library. Additionally, combining classical database search algorithms with the spectral library search enabled the identification of 128 phosphoproteins. The phosphopeptides in the spectral library were manually validated using synthetic phosphopeptides to improve the quality of the library spectra and to minimize false positive spectra.
Project description:Spectral library search (SLS) is a major approach for peptide identification from tandem mass spectrometry data, offering a complementary approach to conventional database search. Moreover, with the emergence of spectrum prediction models, proteomics database search is progressively becoming more like spectral library search of predicted peptide spectra. The performance of peptide identification algorithms thus frequently depends on how well the underlying Spectrum-Spectrum Matching (SSM) scoring functions distinguish true and false positive matches. However, detailed comparative studies evaluating the performance of SSM scoring functions remain limited by the absence of comprehensive benchmark datasets. We propose new methods to build benchmarks that assess the effectiveness and robustness of SSM scoring functions. The resulting benchmark dataset is composed of (i) a set of 476,063 precursors used to construct 8 query spectrum sets with different levels of noise added to "ideal" and real experimental spectra, and (ii) three spectral libraries with different spectra for the same 3,065,819 precursors: experimental spectra, annotated/de-noised spectra and predicted spectra. The benchmark set was then used to evaluate 9 common spectrum preprocessing scenarios, followed by the evaluation of 3 standard SSM scoring functions, Cosine, Projected-Cosine (commonly used for the analysis of chimeric/mixture spectra), and Jensen-Shannon divergence, and 2 additional scoring functions used in state-of-the-art SLS tools: SpectraST and EntropyScore. The results revealed that scoring spectrum-spectrum matches is still an important open problem, with the best recall for typical SLS searches still assessed to be poor at just ~70% at the typical 1% error rate. Overall, SpectraST performed best for spectra with little-to-no noise, but JS-divergence performed better in some cases as it was found to be most resistant to noise. 
Conversely, the performance of Cosine and EntropyScore was found to be generally lower than previously reported, with Projected-Cosine performing especially poorly in most cases. However, the performance of the SSM scoring functions was also found to depend strongly on the minimum number of matching peaks required for each SSM: benchmark results show that both the performance and the relative ranking of the scoring functions can change substantially with how this parameter is set. The resulting benchmark dataset can be used to test and support the development of SSM scoring functions, and the proposed benchmark construction approach provides a foundation that can be extended to additional types of spectrum-spectrum matching.
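The three standard scoring functions named above can be illustrated with a minimal sketch. This is not the benchmark's code: the greedy peak matching, the `tol` and `min_matched` parameters, the definition of the projected cosine as "normalize the query over library-matched peaks only", and the conversion of Jensen-Shannon divergence to a similarity as `1 - JSD` are all assumptions made for illustration. Spectra are assumed to be lists of `(m/z, intensity)` pairs with positive intensities.

```python
import math

def align(query, library, tol=0.02):
    """Greedily pair peaks within tol (m/z units); return intensity vectors
    over the union of peaks plus the number of matched peaks (illustrative)."""
    q_vec, l_vec, used, n_matched = [], [], set(), 0
    for mz_q, int_q in query:
        best = None
        for j, (mz_l, _) in enumerate(library):
            d = abs(mz_q - mz_l)
            if j not in used and d <= tol:
                if best is None or d < abs(mz_q - library[best][0]):
                    best = j
        q_vec.append(int_q)
        if best is None:
            l_vec.append(0.0)          # query peak with no library partner
        else:
            used.add(best)
            l_vec.append(library[best][1])
            n_matched += 1
    for j, (_, int_l) in enumerate(library):
        if j not in used:              # library peak with no query partner
            q_vec.append(0.0)
            l_vec.append(int_l)
    return q_vec, l_vec, n_matched

def _cos(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def cosine(query, library, tol=0.02, min_matched=1):
    q, l, n = align(query, library, tol)
    return _cos(q, l) if n >= min_matched else 0.0

def projected_cosine(query, library, tol=0.02, min_matched=1):
    """Cosine after dropping query peaks that have no library counterpart."""
    q, l, n = align(query, library, tol)
    if n < min_matched:
        return 0.0
    kept = [(a, b) for a, b in zip(q, l) if b > 0]
    return _cos([a for a, _ in kept], [b for _, b in kept])

def js_similarity(query, library, tol=0.02, min_matched=1):
    """1 - Jensen-Shannon divergence (log base 2) of normalized intensities."""
    q, l, n = align(query, library, tol)
    sq, sl = sum(q), sum(l)
    if n < min_matched or sq <= 0 or sl <= 0:
        return 0.0
    p = [a / sq for a in q]
    r = [b / sl for b in l]
    m = [(a + b) / 2 for a, b in zip(p, r)]
    kl = lambda x: sum(a * math.log2(a / b) for a, b in zip(x, m) if a > 0)
    return 1.0 - (0.5 * kl(p) + 0.5 * kl(r))

# Hypothetical spectra: the query is the library spectrum plus one noise peak.
library = [(100.0, 10.0), (200.0, 20.0), (300.0, 5.0)]
query = library + [(250.0, 30.0)]
print(cosine(query, library), projected_cosine(query, library),
      js_similarity(query, library))
```

Raising `min_matched` above the number of shared peaks forces every score to zero in this sketch, which is one simple way to see why the minimum-matched-peaks setting can reorder scoring functions.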
Project description:Active protein translation can be assessed and measured using ribosome profiling sequencing strategies. Existing approaches use sequence fragment length or frame occupancy to differentiate between active translation and background noise; however, they do not consider additional characteristics inherent to the technology, which limits their overall accuracy. Here, we present an analytical tool that models the overall tri-nucleotide periodicity of ribosomal occupancy using a classifier based on spectral coherence. Our software, SPECtre, examines the relationship of normalized ribosome profiling read coverage over a rolling series of windows along a transcript against an idealized reference signal. A comparison of SPECtre against current methods on existing and new data shows a marked improvement in accuracy for detecting active translation, with overall high sensitivity at a low false discovery rate.
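The idea of comparing windowed read coverage against an idealized tri-nucleotide reference can be sketched as a Welch-style magnitude-squared coherence evaluated at a single frequency (one cycle per three nucleotides). This is not SPECtre's implementation: the function names, the window and step sizes, and the single-frequency simplification are assumptions chosen to keep the example short.

```python
import cmath
import math

def dft_at(signal, freq):
    """DFT coefficient of a mean-centred signal at one frequency (cycles/nt)."""
    mean = sum(signal) / len(signal)
    return sum((x - mean) * cmath.exp(-2j * math.pi * freq * n)
               for n, x in enumerate(signal))

def coherence_at(coverage, reference, freq=1 / 3, window=30, step=3):
    """Magnitude-squared coherence between coverage and reference at `freq`,
    accumulated over rolling windows (hypothetical defaults)."""
    sxy, sxx, syy = 0j, 0.0, 0.0
    for start in range(0, len(coverage) - window + 1, step):
        x = dft_at(coverage[start:start + window], freq)
        y = dft_at(reference[start:start + window], freq)
        sxy += x * y.conjugate()   # cross-spectrum
        sxx += abs(x) ** 2         # coverage power
        syy += abs(y) ** 2         # reference power
    return abs(sxy) ** 2 / (sxx * syy) if sxx > 0 and syy > 0 else 0.0

# Hypothetical inputs: an idealized reference with one peak per codon,
# a strongly periodic coverage track, and a flat (non-periodic) one.
reference = [1, 0, 0] * 30
periodic = [5, 1, 1] * 30
flat = [2] * 90
print(coherence_at(periodic, reference), coherence_at(flat, reference))
```

In this sketch the score approaches 1 when the coverage keeps a stable three-nucleotide rhythm relative to the reference across windows and falls to 0 when the coverage carries no power at that frequency. Note that coherence of this kind is insensitive to a constant frame shift, so it indicates periodicity rather than which reading frame is occupied.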
Project description:Proteogenomics approaches often struggle to distinguish correct from false peptide-to-spectrum matches as the database size grows. However, features extracted from tandem mass spectrometry intensity predictors can enhance the peptide identification rate and provide extra confidence for spectral matching in a proteogenomic context. To that end, features from the spectral intensity pattern predictors MS2PIP and Prosit were combined with the canonical scores from MaxQuant in the Percolator post-processing tool for protein databases constructed from RNA-seq and ribosome profiling analyses. The presented results provide evidence that this approach enhances peptide identification power in a proteogenomic setting and, at the same time, leads to the validation of new proteoforms with elevated stringency. In this online repository, we submitted the conventional proteomic search results with MaxQuant against the custom nanopore RNA-seq-based search space. All other results can be found in the supplemental materials of the manuscript, in SRA (sequencing data) or under ProteomeXchange Project PXD011353 (as this is original data from a previous paper).