Proteomics

Dataset Information


QuiXoT+ project: General Statistical framework for quantitative proteomics by stable isotope labeling


ABSTRACT: The combination of stable isotope labeling (SIL) with mass spectrometry (MS) allows comparison of the abundance of thousands of proteins in complex mixtures. However, interpretation of the large datasets generated by these techniques remains a challenge because appropriate statistical standards are lacking. Here we present the first generally applicable model that accurately explains the behavior of data obtained using current SIL approaches, including 18O, iTRAQ and SILAC labeling, and different MS instruments. The model decomposes the total technical variance into spectral, peptide and protein variance components, and its general validity was demonstrated by testing 48 experimental distributions against 18 different null hypotheses. In addition to its general applicability, the performance of the algorithm was similar to or better than that of other existing methods. 
The model also provides a general framework to integrate quantitative information, allowing a comparative analysis of the results obtained from different SIL experiments. The model was applied to the global analysis of protein alterations induced by low H2O2 concentrations in yeast, demonstrating the increased statistical power that may be achieved by rigorous data integration. Our results highlight the importance of establishing an adequate and validated statistical framework for the analysis of high-throughput data.
Except for TOF/TOF MS/MS files, protein identification was carried out with the SEQUEST algorithm (Bioworks 3.2 package, Thermo Finnigan) by searching against a joint Human + Yeast Swissprot database (Uniprot release 57.3, May 2009, 26885 entries) supplemented with porcine trypsin; the joint database was used to increase the statistical power of peptide identification. For 18O-labeled samples, variable (methionine oxidation, lysine and arginine modification of +4 Da) and fixed (cysteine carboxamidomethylation) modifications were used. For iTRAQ, we allowed variable (methionine oxidation) and fixed (cysteine carboxamidomethylation, lysine and N-terminal modification of +144.1020 Da) modifications. For SILAC, we used variable (methionine oxidation, lysine and arginine modification of +6 Da) and fixed (cysteine carboxamidomethylation) modifications. Precursor mass tolerance was set to 2 Da and fragment mass tolerance to 1.2 Da, and up to two missed cleavage sites for trypsin were allowed. The same collections of MS/MS spectra were also searched against inverted databases constructed from the same target databases. SEQUEST results were analyzed using the probability ratio method [22b], taking into account the isoelectric points of peptides to improve peptide identification [9b]. False discovery rates (FDR) of peptide identifications were calculated from the search results against the inverted databases using the refined method [9b].
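The inverted-database search described above can be illustrated with a minimal sketch of a target/decoy FDR estimate. This is the basic ratio of decoy hits to target hits above a score threshold for separate target and inverted searches, not the refined method cited in the text. The function names, the example scores and the `max_fdr` default are hypothetical and are not taken from the dataset.

```python
from typing import Iterable, Optional


def fdr_at_threshold(target_scores: Iterable[float],
                     decoy_scores: Iterable[float],
                     threshold: float) -> float:
    """Basic FDR estimate: decoy hits / target hits at or above `threshold`.

    Assumes the target and inverted (decoy) databases were searched
    separately, so the decoy count estimates the false target hits.
    """
    n_target = sum(1 for s in target_scores if s >= threshold)
    n_decoy = sum(1 for s in decoy_scores if s >= threshold)
    if n_target == 0:
        return 0.0  # nothing accepted, so no false discoveries to report
    return min(1.0, n_decoy / n_target)


def score_cutoff_for_fdr(target_scores: list,
                         decoy_scores: list,
                         max_fdr: float = 0.05) -> Optional[float]:
    """Lowest observed target score whose estimated FDR is <= max_fdr.

    Returns None when no candidate threshold satisfies the limit.
    `max_fdr` is an illustrative default, not a value from the source.
    """
    best = None
    for t in sorted(set(target_scores), reverse=True):
        if fdr_at_threshold(target_scores, decoy_scores, t) <= max_fdr:
            best = t  # keep lowering the cutoff while the FDR limit holds
    return best


if __name__ == "__main__":
    targets = [10, 9, 8, 7, 6, 5, 4, 3]   # hypothetical target-search scores
    decoys = [4.5, 3.5, 2]                # hypothetical inverted-search scores
    print(score_cutoff_for_fdr(targets, decoys, max_fdr=0.05))
```

Scanning every observed target score, rather than stopping at the first failure, keeps the sketch correct even when the estimated FDR is not monotonic in the threshold.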
TOF/TOF MS/MS spectra were converted to MGF files using the 4000 Series Explorer V.3.5.1 internal processor with the following parameters: mass range from precursor -20 Da to 60 Da, peak density of 50 peaks per 200 Da with S/N > 1 and minimum area > 1, and a maximum of 100 peaks per precursor. These files were analyzed with the Mascot V2.2.06 search engine using the database and search conditions indicated above.
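The peak-list parameters above can be read as a sequence of filters, sketched below. This is not the vendor's implementation, and several points are assumptions: the mass range is interpreted as 60 Da up to the precursor value minus 20 Da, the density limit is applied to fixed 200 Da bins starting at 0, and peaks are ranked by area when a limit is exceeded. The `Peak` class and `filter_peaks` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Peak:
    mz: float    # peak position (Da)
    area: float  # integrated peak area
    sn: float    # signal-to-noise ratio


def filter_peaks(peaks: List[Peak],
                 precursor: float,
                 min_mz: float = 60.0,
                 precursor_offset: float = 20.0,
                 min_sn: float = 1.0,
                 min_area: float = 1.0,
                 window: float = 200.0,
                 per_window: int = 50,
                 max_peaks: int = 100) -> List[Peak]:
    """Apply mass-range, S/N, area, density and total-count limits."""
    upper = precursor - precursor_offset
    # 1. Mass range and quality thresholds (strictly greater than the minima).
    kept = [p for p in peaks
            if min_mz <= p.mz <= upper and p.sn > min_sn and p.area > min_area]

    # 2. Density limit: at most `per_window` peaks per fixed `window`-wide bin
    #    (assumption: fixed bins and ranking by area).
    bins: Dict[int, List[Peak]] = {}
    for p in kept:
        bins.setdefault(int(p.mz // window), []).append(p)
    dense: List[Peak] = []
    for group in bins.values():
        group.sort(key=lambda p: p.area, reverse=True)
        dense.extend(group[:per_window])

    # 3. Overall cap per precursor, then restore ascending m/z order.
    dense.sort(key=lambda p: p.area, reverse=True)
    return sorted(dense[:max_peaks], key=lambda p: p.mz)


if __name__ == "__main__":
    spectrum = [Peak(50, 10, 5), Peak(100, 10, 5), Peak(150, 0.5, 5),
                Peak(300, 10, 0.9), Peak(500, 3, 2), Peak(985, 10, 5)]
    print([p.mz for p in filter_peaks(spectrum, precursor=1000.0)])
```

A sliding window, or ranking by intensity instead of area, would be equally plausible readings of the density parameter; the source does not say which one the software uses.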

INSTRUMENT(S): LTQ Orbitrap, LTQ

ORGANISM(S): Saccharomyces cerevisiae (baker's yeast)

SUBMITTER: Marco Trevisan-Herraz  

LAB HEAD: Jesús Vázquez

PROVIDER: PXD000325 | Pride | 2015-05-11

REPOSITORIES: Pride


Publications


The combination of stable isotope labeling (SIL) with mass spectrometry (MS) allows comparison of the abundance of thousands of proteins in complex mixtures. However, interpretation of the large data sets generated by these techniques remains a challenge because appropriate statistical standards are lacking. Here, we present a generally applicable model that accurately explains the behavior of data obtained using current SIL approaches, including (18)O, iTRAQ, and SILAC labeling, and different M ...

Similar Datasets

2013-11-22 | PXD000327 | Pride
2013-12-02 | PXD000192 | Pride
2017-04-01 | GSE79103 | GEO
2021-07-26 | PXD024641 | Pride
2014-01-13 | PXD000255 | Pride
2019-12-04 | PXD000583 | Pride
2013-08-28 | PXD000067 | Pride
2020-04-23 | PXD018321 | Pride
2013-06-19 | PXD000188 | Pride
2022-02-15 | PXD000334 | Pride