Proteomics

Dataset Information


Combining de novo peptide sequencing algorithms, a synergistic approach to boost both identifications and confidence in bottom-up proteomics


ABSTRACT: Complex MS-based proteomics datasets are usually analyzed by protein database searches. While this approach performs well for sequenced organisms, direct inference of peptide sequences from tandem mass spectra, i.e., de novo peptide sequencing, is often the only way to obtain information when protein databases are absent. However, available algorithms suffer from drawbacks such as a lack of validation and often high rates of false positive hits (FP). Here we present a simple method of combining results from commonly available de novo peptide sequencing algorithms, which, in conjunction with minor tweaks in data acquisition, yields a lower empirical FDR than analysis using single algorithms. Results were validated using state-of-the-art database search algorithms as well as specifically synthesized reference peptides. With this approach, we increased the number of PSMs meeting a stringent FDR of 5% more than threefold compared to the single best de novo sequencing algorithm alone, to an average of 11,120 PSMs (combined) instead of 3,476 PSMs (alone) in triplicate 2 h LC-MS runs of a tryptic HeLa digest.
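The idea of combining de novo results and checking them against a reference can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not specify how results are combined, so the simple agreement vote, the `min_agree` threshold, the I/L normalization, the algorithm names, and the toy sequences below are all assumptions made for illustration.

```python
from collections import Counter


def normalize(seq):
    # Assumption: treat isoleucine and leucine as identical, since they are
    # isobaric and typically indistinguishable in tandem mass spectra.
    return seq.upper().replace("I", "L")


def consensus_psms(results, min_agree=2):
    """Keep a spectrum only if at least `min_agree` algorithms agree.

    results: dict mapping algorithm name -> {spectrum id: peptide sequence}.
    Returns {spectrum id: agreed (normalized) sequence}.
    """
    votes = {}
    for calls in results.values():
        for spectrum_id, seq in calls.items():
            votes.setdefault(spectrum_id, Counter())[normalize(seq)] += 1
    consensus = {}
    for spectrum_id, counter in votes.items():
        seq, n = counter.most_common(1)[0]
        if n >= min_agree:
            consensus[spectrum_id] = seq
    return consensus


def empirical_fdr(calls, reference):
    """Fraction of calls that disagree with a reference identification.

    Only spectra present in both `calls` and `reference` are counted.
    Returns None when there is no overlap.
    """
    checked = [sid for sid in calls if sid in reference]
    if not checked:
        return None
    wrong = sum(
        1 for sid in checked
        if normalize(calls[sid]) != normalize(reference[sid])
    )
    return wrong / len(checked)


# Hypothetical toy data: three unnamed algorithms, three spectra.
results = {
    "algo_a": {"s1": "PEPTIDEK", "s2": "ELVISK", "s3": "SAMPLER"},
    "algo_b": {"s1": "PEPTLDEK", "s2": "ELVLSR", "s3": "SAMPLER"},
    "algo_c": {"s1": "PEPTIDEK", "s2": "QQQQK"},
}
# Hypothetical reference, standing in for database-search identifications.
reference = {"s1": "PEPTIDEK", "s3": "SAMPLEK"}

consensus = consensus_psms(results)
fdr = empirical_fdr(consensus, reference)
print(consensus)  # spectra s1 and s3 reach agreement; s2 does not
print(fdr)        # 0.5: one of the two checked calls disagrees
```

For the figures quoted in the abstract, the fold increase is 11,120 / 3,476 ≈ 3.2, consistent with "more than threefold".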

INSTRUMENT(S): Q Exactive

ORGANISM(S): Homo sapiens (human), Saccharomyces cerevisiae (baker's yeast), Radix auricularia, Mus musculus (mouse)

TISSUE(S): Whole Body, Cell Culture, HeLa Cell

SUBMITTER: Bernhard Blank-Landeshammer  

LAB HEAD: Prof. Dr. Albert Sickmann

PROVIDER: PXD005280 | Pride | 2019-09-25

REPOSITORIES: Pride


Publications

Combining De Novo Peptide Sequencing Algorithms, A Synergistic Approach to Boost Both Identifications and Confidence in Bottom-up Proteomics.

Blank-Landeshammer B, Kollipara L, Biß K, Pfenninger M, Malchow S, Shuvaev K, Zahedi RP, Sickmann A

Journal of Proteome Research, 2017-08-22, issue 9



Similar Datasets

2016-01-12 | PXD003317 | Pride
2019-01-11 | PXD008688 | Pride
2019-01-11 | PXD008690 | Pride
2019-01-11 | PXD011562 | Pride
2022-02-15 | PXD030203 | Pride
2012-06-01 | E-GEOD-22079 | biostudies-arrayexpress
2013-12-04 | E-GEOD-45569 | biostudies-arrayexpress
2018-11-25 | E-MTAB-7351 | biostudies-arrayexpress
2021-07-16 | PXD022784 | Pride
2015-05-12 | E-GEOD-65891 | biostudies-arrayexpress