Proteomics

Dataset Information

Mining Missing Protein Landscape in Subcellular Proteome of Human Embryonic Stem Cells


ABSTRACT: Human embryonic stem cells (hESCs) have the capacity for self-renewal and multi-lineage differentiation into derivatives of the three germ layers, which is of clinical importance for regenerative medicine. Despite significant progress in the study of hESCs for clinical application, the factors responsible for maintaining their undifferentiated and pluripotent nature are still largely unknown. The knowledge gap between human genomic information and its proteome has been attributed to 'missing proteins' (MPs), proteins that lack stringent evidence of existence at the protein level. Investigation of MPs has continued since 2012 as one of the major goals of the Chromosome-Centric Human Proteome Project (C-HPP). Nearly 18% of human proteins still await high-stringency mass spectrometry evidence of their existence in the human proteome. In this study, we used subcellular proteomic analysis to identify the nuclear, cytoplasmic and membrane proteins involved in regulating ESC pluripotency and, among them, to mine missing proteins at this very early cell state. We identified 11,970 unique proteins (264,088 unique peptides) in hESCs, of which 6,138 were annotated as membrane proteins. Compared with large-scale proteomic analyses of hESCs, this study additionally identified 3,626 unique protein-coding genes. This large-scale proteomic dataset yielded a total of 296 candidate detections of missing proteins at 1% FDR. Following the strict criteria of the HPP data interpretation guidelines (version 2.1.0), a total of 25 Gold Missing Proteins (GMPs) were confidently identified, of which 12 were verified with synthetic peptides and 10 were finally confirmed with 2 unique peptides. In summary, these analyses suggest that our results cover the most comprehensive proteome and membrane proteome of hESCs, which should be valuable for providing protein-level evidence for missing membrane proteins.
Sample Processing Protocol: Subcellular purification was applied to generate nuclear, cytoplasmic and membrane protein fractions using our previously reported 2-step centrifugation method with sucrose. These three protein fractions were subjected to gel-assisted digestion with trypsin, followed by peptide fractionation. We employed our sensitive high-pH reversed-phase (Hp-RP) stop-and-go extraction tip (StageTip) technique to collect 7 RP fractions for the first-batch analysis. For the second batch of membrane proteome analysis and both batches of the cytoplasmic and nuclear proteomes, we applied high-pH reversed-phase liquid chromatography (RPLC) with concatenation to generate 12 RP fractions (membrane proteome) and 24 RP fractions (nuclear and cytoplasmic proteomes), respectively. All RP fractions were analyzed in duplicate on high-end Orbitrap Fusion™ or Fusion™ Lumos™ Tribrid™ mass spectrometers.
Data Processing Protocol: Protein identification was performed for each RP fraction using Proteome Discoverer 2.1 (PD 2.1) with three search engines (Mascot, SequestHT and MSAmanda) at 1% FDR at the peptide, protein and high-confidence PSM levels. To obtain the total list of proteins identified in the nuclear, cytoplasmic and membrane proteomes, we used the consensus workflow in PD 2.1 to combine the data from all fractions, filtering to 1% protein FDR. To mine missing proteins from the subcellular proteomes of hESCs, we followed the C-HPP guidelines (version 2.1.0), applying all of the following stringent criteria for stepwise filtering of candidate detections: (1) detection at 1% FDR (protein, peptide) with high-confidence, unambiguous PSMs; (2) uniqueness of the peptide, with alternative peptide mappings, isobaric sequences and SAAVs checked by the Peptide Uniqueness Checker (neXtProt, 2018-01-17 release); (3) unique peptide sequences of length ≥9 amino acids; and (4) two or more unique, non-nested peptide sequences.
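The stepwise filtering in criteria (1) to (4) can be illustrated with a minimal sketch. This is not the submitters' pipeline: the `PeptideHit` record, the `passes_fdr` and `is_unique` flags, and the protein labels `P_A`, `P_B` and `P_C` are hypothetical names introduced here. The sketch assumes that criteria (1) and (2) have already been decided upstream (by the search-engine workflow and the Peptide Uniqueness Checker) and only applies the length and non-nested peptide rules.

```python
from collections import defaultdict
from dataclasses import dataclass

MIN_LENGTH = 9     # criterion (3): peptide length >= 9 amino acids
MIN_PEPTIDES = 2   # criterion (4): two or more unique, non-nested peptides


@dataclass(frozen=True)
class PeptideHit:
    """Hypothetical record for one identified peptide."""
    protein: str       # protein label (made-up identifiers in this sketch)
    sequence: str      # peptide amino-acid sequence
    passes_fdr: bool   # criterion (1), assumed to be decided upstream
    is_unique: bool    # criterion (2), assumed to be decided upstream


def non_nested(sequences):
    """Keep only sequences that are not contained in a longer kept sequence."""
    kept = []
    # Longest first, so a shorter peptide is always compared to longer ones.
    for seq in sorted(set(sequences), key=lambda s: (-len(s), s)):
        if not any(seq in longer for longer in kept):
            kept.append(seq)
    return kept


def filter_candidates(hits):
    """Return {protein: peptides} for proteins passing all four criteria."""
    by_protein = defaultdict(list)
    for hit in hits:
        if hit.passes_fdr and hit.is_unique and len(hit.sequence) >= MIN_LENGTH:
            by_protein[hit.protein].append(hit.sequence)
    selected = {}
    for protein, seqs in by_protein.items():
        peptides = non_nested(seqs)
        if len(peptides) >= MIN_PEPTIDES:
            selected[protein] = peptides
    return selected


# Made-up example data, not taken from the dataset.
example_hits = [
    PeptideHit("P_A", "AAAAGGGGKLLLR", True, True),
    PeptideHit("P_A", "GGGGKLLLR", True, True),      # nested in the peptide above
    PeptideHit("P_B", "ACDEFGHIKLMNPQR", True, True),
    PeptideHit("P_B", "STVWYACDEFK", True, True),
    PeptideHit("P_C", "ACDEFGHK", True, True),       # only 8 residues
    PeptideHit("P_C", "LMNPQRSTVWYK", True, True),
]

selected = filter_candidates(example_hits)
print(selected)
```

In this made-up example only `P_B` is retained: `P_A` has two peptides but one is nested in the other, and `P_C` is left with a single peptide once the 8-residue sequence is dropped.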
To validate missing proteins with synthetic peptides, MRM analysis was conducted on a QTrap 5500 system (AB SCIEX, Concord, ON) using a trap column (ACQUITY UPLC® Symmetry C18, 5 µm, 180 µm × 20 mm) and a BEH C18-AQ column (nanoACQUITY®, 1.7 µm, 75 µm × 250 mm). Membrane protein annotations were taken from neXtProt (2018-01 release). We further annotated the cellular localization and the number of transmembrane helices (TMH) of all identified proteins using the same neXtProt release (2018-01). All proteins identified in the three subcellular proteome datasets were analyzed with the KEGG and IPA databases for pathway analysis. The Human Protein Atlas (HPA), Gene Ontology and other informatics tools were used as sources for lineage-specific analysis.

ORGANISM(S): Homo sapiens (human)

SUBMITTER: Yu-Ju Chen 

PROVIDER: PXD009840 | JPOST Repository | Mon May 20 00:00:00 BST 2019

REPOSITORIES: jPOST

Publications


Human embryonic stem cells (hESCs) have the capacity for self-renewal and multilineage differentiation, which are of clinical importance for regeneration medicine. Despite the significant progress of hESC study, the complete hESC proteome atlas, especially the surface protein composition, awaits delineation. According to the latest release of neXtProt database (January 17, 2018; 19 658 PE1, 2, 3, and 4 human proteins), membrane proteins present the major category (1047; 48%) among all 2186 missi ...

Similar Datasets

2022-10-19 | GSE167587 | GEO
2013-04-23 | E-GEOD-44564 | biostudies-arrayexpress
2015-07-27 | PXD002224 | Pride
2009-09-01 | GSE12042 | GEO
2013-04-23 | GSE44564 | GEO
2020-04-22 | GSE140630 | GEO
2005-08-24 | GSE3180 | GEO
2018-05-07 | GSE106862 | GEO
2022-02-24 | PXD025586 | Pride
2005-08-24 | E-GEOD-3180 | biostudies-arrayexpress