Proteomics

Dataset Information

Application of de novo sequencing to large-scale complex proteomics datasets


ABSTRACT: Dependent on concise, predefined protein sequence databases, traditional search algorithms perform poorly when analyzing mass spectra derived from wholly uncharacterized protein products. Conversely, de novo peptide sequencing algorithms can interpret mass spectra without relying on reference databases. However, such algorithms have been difficult to apply to complex protein mixtures, in part due to a lack of methods for automatically validating de novo sequencing results. Here, we present novel metrics for benchmarking de novo sequencing algorithm performance on large-scale proteomics datasets and a method for accurately calibrating false discovery rates on de novo results. We also present a novel algorithm (LADS) that leverages experimentally disambiguated fragmentation spectra to boost sequencing accuracy and sensitivity. LADS improves sequencing accuracy on longer peptides relative to other algorithms and improves discriminability of correct and incorrect sequences. Using these advancements, we demonstrate accurate de novo identification of peptide sequences not identifiable using database search-based approaches.

INSTRUMENT(S): LTQ Orbitrap Velos

ORGANISM(S): Bos taurus (bovine), Saccharomyces cerevisiae (baker's yeast)

TISSUE(S): Brain

SUBMITTER: Arun Devabhaktuni  

LAB HEAD: Joshua E. Elias

PROVIDER: PXD003317 | Pride | 2016-01-12

REPOSITORIES: Pride

Publications

Application of de Novo Sequencing to Large-Scale Complex Proteomics Data Sets.

Arun Devabhaktuni, Joshua E. Elias

Journal of Proteome Research, 2016-01-25, issue 3



Similar Datasets

2019-09-25 | PXD005280 | Pride
2012-06-01 | E-GEOD-22079 | biostudies-arrayexpress
2018-11-25 | E-MTAB-7351 | biostudies-arrayexpress
2017-04-03 | PXD003804 | Pride
2019-01-11 | PXD008688 | Pride
2016-07-01 | E-GEOD-81949 | biostudies-arrayexpress
2016-04-01 | E-GEOD-78893 | biostudies-arrayexpress
2013-03-31 | E-GEOD-40446 | biostudies-arrayexpress
2021-05-13 | E-MTAB-8893 | biostudies-arrayexpress
2019-01-11 | PXD008690 | Pride