Dataset Information

Discovery of coding regions in the human genome by integrated proteogenomics analysis workflow.


ABSTRACT: Proteogenomics enables the discovery of novel peptides (from unannotated genomic protein-coding loci) and single amino acid variant peptides (derived from single-nucleotide polymorphisms and mutations). Increasing the reliability of these identifications is crucial to ensure their usefulness for genome annotation and their potential application as neoantigens in cancer immunotherapy. Here we present the integrated proteogenomics analysis workflow (IPAW), which combines peptide discovery, curation, and validation. IPAW includes the SpectrumAI tool for automated inspection of MS/MS spectra, which eliminates false identifications of single-residue substitution peptides. We employ IPAW to analyze two proteomics data sets acquired from A431 cells and five normal human tissues using extended-range (pH 3-10) high-resolution isoelectric focusing (HiRIEF) pre-fractionation and TMT-based peptide quantitation. The IPAW results provide evidence for the translation of pseudogenes, lncRNAs, short ORFs, alternative ORFs, N-terminal extensions, and intronic sequences. Moreover, our quantitative analysis indicates that protein production from certain pseudogenes and lncRNAs is tissue specific.

SUBMITTER: Zhu Y 

PROVIDER: S-EPMC5834625 | biostudies-literature | 2018 Mar

REPOSITORIES: biostudies-literature

Publications

Discovery of coding regions in the human genome by integrated proteogenomics analysis workflow.

Zhu Y, Orre LM, Johansson HJ, Huss M, Boekel J, Vesterlund M, Fernandez-Woodbridge A, Branca RMM, Lehtiö J

Nature Communications, 2018-03-02, issue 1



Similar Datasets

| S-EPMC2812506 | biostudies-literature
| S-EPMC19553 | biostudies-literature
| S-EPMC4762527 | biostudies-literature
| S-EPMC6589356 | biostudies-literature
| S-EPMC6262853 | biostudies-other
2016-05-27 | PXD002967 | Pride
| S-EPMC6337836 | biostudies-literature
| S-EPMC4895710 | biostudies-literature