Dataset Information

Deep learning the collisional cross sections of the peptide universe from a million experimental values.


ABSTRACT: The size and shape of peptide ions in the gas phase are an under-explored dimension for mass spectrometry-based proteomics. To investigate the nature and utility of the peptide collisional cross section (CCS) space, we measure more than a million data points from whole-proteome digests of five organisms with trapped ion mobility spectrometry (TIMS) and parallel accumulation-serial fragmentation (PASEF). The scale and precision (CV < 1%) of our data is sufficient to train a deep recurrent neural network that accurately predicts CCS values solely based on the peptide sequence. Cross section predictions for the synthetic ProteomeTools peptides validate the model within a 1.4% median relative error (R > 0.99). Hydrophobicity, proportion of prolines and position of histidines are main determinants of the cross sections in addition to sequence-specific interactions. CCS values can now be predicted for any peptide and organism, forming a basis for advanced proteomics workflows that make full use of the additional information.
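The abstract summarizes data quality and model accuracy with three statistics: a coefficient of variation (CV), a median relative error, and a correlation coefficient (R). As a rough illustration of how such figures are typically computed, here is a minimal sketch. It assumes R is a Pearson correlation, and every number in it is hypothetical and not taken from the dataset. The function names are illustrative and do not come from the authors' code.

```python
import math
from statistics import mean, median, stdev


def coefficient_of_variation(replicates):
    """Sample standard deviation divided by the mean, in percent."""
    return 100.0 * stdev(replicates) / mean(replicates)


def median_relative_error(measured, predicted):
    """Median of |predicted - measured| / measured, in percent."""
    return 100.0 * median(abs(p - m) / m for m, p in zip(measured, predicted))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical CCS values (illustrative only, not from the dataset).
replicate_ccs = [500.0, 502.0, 498.0]          # repeated measurements of one peptide ion
measured_ccs = [400.0, 500.0, 600.0, 800.0]    # experimental values for four peptides
predicted_ccs = [404.0, 495.0, 612.0, 792.0]   # model predictions for the same peptides

cv_percent = coefficient_of_variation(replicate_ccs)
mre_percent = median_relative_error(measured_ccs, predicted_ccs)
r_value = pearson_r(measured_ccs, predicted_ccs)

print(f"CV: {cv_percent:.2f}%")
print(f"Median relative error: {mre_percent:.2f}%")
print(f"Pearson R: {r_value:.4f}")
```

The median is used rather than the mean for the relative error because it is less sensitive to a small number of badly predicted peptides.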

SUBMITTER: Meier F 

PROVIDER: S-EPMC7896072 | biostudies-literature | 2021 Feb

REPOSITORIES: biostudies-literature

Publications

Deep learning the collisional cross sections of the peptide universe from a million experimental values.

Meier F, Köhler ND, Brunner AD, Wanka JH, Voytik E, Strauss MT, Theis FJ, Mann M

Nature Communications, 2021-02-19, issue 1


Similar Datasets

2021-01-18 | PXD019086 | PRIDE
| S-EPMC10521631 | biostudies-literature
| S-EPMC10840072 | biostudies-literature
| S-EPMC5634518 | biostudies-literature
| S-EPMC7511037 | biostudies-literature
2023-09-21 | PXD043026 | jPOST Repository
| S-EPMC7317366 | biostudies-literature
| S-EPMC8196406 | biostudies-literature
2020-04-22 | GSE142238 | GEO
| S-EPMC7522765 | biostudies-literature