pGlyco: a pipeline for the identification of intact N-glycopeptides by using HCD- and CID-MS/MS and MS3.
ABSTRACT: Confident characterization of the microheterogeneity of protein glycosylation through identification of intact glycopeptides remains one of the toughest analytical challenges for glycoproteomics. Recently proposed mass spectrometry (MS)-based methods still have defects, such as the lack of false discovery rate (FDR) analysis for glycan identification and the lack of sufficient fragmentation information for peptide identification. Here we propose pGlyco, a novel pipeline for the identification of intact glycopeptides by using complementary MS techniques: 1) HCD-MS/MS followed by product-dependent CID-MS/MS was used to provide complementary fragments to identify the glycans, and a novel target-decoy method was developed to estimate the false discovery rate of the glycan identification; 2) data-dependent acquisition of MS3 for some of the most intense peaks of HCD-MS/MS was used to provide fragments to identify the peptide backbones. By integrating HCD-MS/MS, CID-MS/MS and MS3, intact glycopeptides could be confidently identified. With pGlyco, a standard glycoprotein mixture was analyzed in the Orbitrap Fusion, and 309 non-redundant intact glycopeptides were identified with detailed spectral information of both glycans and peptides.
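The target-decoy idea mentioned in the abstract can be illustrated with a minimal, generic sketch. It assumes each match carries a score and a flag saying whether it came from a decoy, and it estimates the FDR as decoys divided by targets above a cutoff. The function name `fdr_threshold` and the input layout are hypothetical; how pGlyco actually constructs glycan decoys is not described here and is not reproduced.

```python
from typing import List, Tuple

def fdr_threshold(matches: List[Tuple[float, bool]], max_fdr: float = 0.01) -> float:
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    matches: (score, is_decoy) pairs; higher scores are better.
    The FDR at a cutoff is estimated as (#decoys) / (#targets) among
    matches scoring at or above that cutoff. Returns +inf if no cutoff
    qualifies. Score ties are not treated specially in this sketch.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    cutoff = float("inf")
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep extending the cutoff downward while the estimate stays acceptable.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff
```

Target matches scoring at or above the returned cutoff would then be reported as identifications at the chosen FDR level.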
Project description:Campylobacter jejuni is a gastrointestinal pathogen that is able to modify membrane and periplasmic proteins by the N-linked addition of a 7-residue glycan at the strict attachment motif (D/E)XNX(S/T). Strategies for a comprehensive analysis of the targets of glycosylation, however, are hampered by the resistance of the glycan-peptide bond to enzymatic digestion or β-elimination and have previously concentrated on soluble glycoproteins compatible with lectin affinity and gel-based approaches. We developed strategies for enriching C. jejuni HB93-13 glycopeptides using zwitterionic hydrophilic interaction chromatography and examined novel fragmentation, including collision-induced dissociation (CID) and higher energy collisional (C-trap) dissociation (HCD) as well as CID/electron transfer dissociation (ETD) mass spectrometry. CID/HCD enabled the identification of glycan structure and peptide backbone, allowing glycopeptide identification, whereas CID/ETD enabled the elucidation of glycosylation sites by maintaining the glycan-peptide linkage. A total of 130 glycopeptides, representing 75 glycosylation sites, were identified from LC-MS/MS using zwitterionic hydrophilic interaction chromatography coupled to CID/HCD and CID/ETD. CID/HCD provided the majority of the identifications (73 sites) compared with ETD (26 sites). We also examined soluble glycoproteins by soybean agglutinin affinity and two-dimensional electrophoresis and identified a further six glycosylation sites. This study more than doubles the number of confirmed N-linked glycosylation sites in C. jejuni and is the first to utilize HCD fragmentation for glycopeptide identification with intact glycan. We also show that hydrophobic integral membrane proteins are significant targets of glycosylation in this organism. 
Our data demonstrate that peptide-centric approaches coupled to novel mass spectrometric fragmentation techniques may be suitable for application to eukaryotic glycoproteins for simultaneous elucidation of glycan structures and peptide sequence.
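As an illustration of the attachment motif (D/E)XNX(S/T) named above, the following sketch scans a protein sequence for candidate sites. The function name `find_sequons` is hypothetical, and excluding proline at the X positions is an assumption taken from common descriptions of N-glycosylation sequons; the abstract itself only writes X.

```python
import re
from typing import List

# (D/E) X N X (S/T), with X assumed to be any residue except proline.
# The lookahead lets overlapping motif instances all be reported.
SEQUON = re.compile(r"(?=([DE][^P]N[^P][ST]))")

def find_sequons(protein: str) -> List[int]:
    """Return 1-based positions of the Asn in each motif match."""
    # m.start() is the 0-based index of D/E; the Asn sits two residues later.
    return [m.start() + 3 for m in SEQUON.finditer(protein.upper())]
```

For example, `find_sequons("MKDANSTA")` reports position 5, the Asn inside `DANST`.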
Project description:Glycoprotein changes occur in not only protein abundance but also the occupancy of each glycosylation site by different glycoforms during biological or pathological processes. Recent advances in mass spectrometry instrumentation and techniques have facilitated analysis of intact glycopeptides in complex biological samples by allowing the users to generate spectra of intact glycopeptides with glycans attached to each specific glycosylation site. However, assigning these spectra, leading to identification of the glycopeptides, is challenging. Here, we report an algorithm, named GPQuest, for site-specific identification of intact glycopeptides using higher-energy collisional dissociation (HCD) fragmentation of complex samples. In this algorithm, a spectral library of glycosite-containing peptides in the sample was built by analyzing the isolated glycosite-containing peptides using HCD LC-MS/MS. Spectra of intact glycopeptides were selected by using glycan oxonium ions as signature ions for glycopeptide spectra. These oxonium-ion-containing spectra were then compared with the spectral library generated from glycosite-containing peptides, resulting in assignment of each intact glycopeptide MS/MS spectrum to a specific glycosite-containing peptide. The glycan occupying each glycosite was determined by matching the mass difference between the precursor ion of intact glycopeptide and the glycosite-containing peptide to a glycan database. Using GPQuest, we analyzed LC-MS/MS spectra of protein extracts from prostate tumor LNCaP cells. Without enrichment of glycopeptides from global tryptic peptides and at a false discovery rate of 1%, 1008 glycan-containing MS/MS spectra were assigned to 769 unique intact N-linked glycopeptides, representing 344 N-linked glycosites with 57 different N-glycans. Spectral library matching using GPQuest assigns the HCD LC-MS/MS generated spectra of intact glycopeptides in an automated and high-throughput manner. 
Additionally, spectral library matching gives the user the possibility of identifying novel or modified glycans on specific glycosites that might be missing from the predetermined glycan databases.
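The glycan-assignment step described above (matching the mass difference between the intact glycopeptide precursor and the glycosite-containing peptide against a glycan database) can be sketched as follows. This is a minimal illustration, not GPQuest's code: the function names, the dictionary layout of the glycan database, and the 10 ppm tolerance are assumptions. The monosaccharide values are standard monoisotopic residue masses.

```python
from typing import Dict, List

# Monoisotopic residue masses (Da) of common monosaccharides.
MONO = {"Hex": 162.05282, "HexNAc": 203.07937, "Fuc": 146.05791, "NeuAc": 291.09542}

def glycan_mass(composition: Dict[str, int]) -> float:
    """Mass a glycan of this composition adds to the peptide."""
    return sum(MONO[name] * count for name, count in composition.items())

def match_glycans(precursor_mass: float, peptide_mass: float,
                  glycan_db: Dict[str, Dict[str, int]],
                  tol_ppm: float = 10.0) -> List[str]:
    """Return names of database glycans whose mass explains the mass difference.

    Both masses are neutral monoisotopic masses; the tolerance is taken
    relative to the precursor mass (an assumption of this sketch).
    """
    delta = precursor_mass - peptide_mass
    tol = precursor_mass * tol_ppm / 1e6
    return [name for name, comp in glycan_db.items()
            if abs(glycan_mass(comp) - delta) <= tol]
```

A mass difference that matches no database entry would be a candidate for the novel or modified glycans mentioned in the abstract.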
Project description:Urine is a complex mixture of proteins and waste products and a challenging biological fluid for biomarker discovery. Previous proteomic studies have identified more than 2800 urinary proteins but analyses aimed at unraveling glycan structures and glycosylation sites of urinary glycoproteins are lacking. Glycoproteomic characterization remains difficult because of the complexity of glycan structures found mainly on asparagine (N-linked) or serine/threonine (O-linked) residues. We have developed a glycoproteomic approach that combines efficient purification of urinary glycoproteins/glycopeptides with complementary MS-fragmentation techniques for glycopeptide analysis. Starting from clinical sample size, we eliminated interfering urinary compounds by dialysis and concentrated the purified urinary proteins by lyophilization. Sialylated urinary glycoproteins were conjugated to a solid support by hydrazide chemistry and trypsin digested. Desialylated glycopeptides, released through mild acid hydrolysis, were characterized by tandem MS experiments utilizing collision induced dissociation (CID) and electron capture dissociation fragmentation techniques. In CID-MS(2), Hex(5)HexNAc(4)-N-Asn and HexHexNAc-O-Ser/Thr were typically observed, in agreement with known N-linked biantennary complex-type and O-linked core 1-like structures, respectively. Additional glycoforms for specific N- and O-linked glycopeptides were also identified, e.g. tetra-antennary N-glycans and fucosylated core 2-like O-glycans. Subsequent CID-MS(3), of selected fragment-ions from the CID-MS(2) analysis, generated peptide specific b- and y-ions that were used for peptide identification. In total, 58 N- and 63 O-linked glycopeptides from 53 glycoproteins were characterized with respect to glycan- and peptide sequences. 
The combination of CID and electron capture dissociation techniques allowed for the exact identification of Ser/Thr attachment site(s) for 40 of 57 putative O-glycosylation sites. We defined 29 O-glycosylation sites which have, to our knowledge, not been previously reported. This is the first study of human urinary glycoproteins where "intact" glycopeptides were studied, i.e. the presence of glycans and their attachment sites were proven without doubt.
Project description:To overcome the challenges in the analysis of protein glycosylation, we have developed a comprehensive and universal tool through permethylation of glycopeptides and their tandem mass spectrometric analysis. This method has the potential to simplify glycoprotein analysis by integrating glycan sequencing and glycopeptide analysis in a single experiment. Moreover, glycans with unique glycosidic linkages, particularly from prokaryotes, which are resistant to enzymatic or chemical release, could also be detected and analyzed by this methodology. Here we present a strategy for the permethylation of intact glycopeptides, obtained via controlled protease digest, and their characterization by using advanced mass spectrometry. We used bovine RNase B, human transferrin, and bovine fetuin as models to demonstrate the feasibility of the method. Remarkably, the glycan patterns, glycosylation site, and their occupancy by N-glycans are all detected and identified in a single experimental procedure. Acquisition on a high resolution tandem-MSn system with fragmentation methodologies such as high-energy collision dissociation (HCD) and collision induced dissociation (CID), provided the complete sequence of the glycan structures attached to the peptides. The behavior of the 20 natural amino acids under the basic permethylation conditions was probed by permethylating a library of short synthetic peptides. Our studies indicate that the permethylation imparts simple, limited, and predictable chemical transformations on peptides and does not interfere with the interpretation of MS/MS data. In addition to this, permethylated O-glycans in unreduced form (released by β-elimination) were also detected, allowing us to profile O-linked glycan structures simultaneously.
Project description:The inherent structural complexity and diversity of glycans pose a major analytical challenge to their structural analysis. Radical chemistry has gained considerable momentum in the field of mass spectrometric biomolecule analysis, including proteomics, glycomics, and lipidomics. Herein, seven isomeric disaccharides and two isomeric tetrasaccharides with subtle structural differences are distinguished rapidly and accurately via one-step radical-induced dissociation. The free-radical-activated glycan-sequencing reagent (FRAGS) selectively conjugates to the unique reducing terminus of glycans in which a localized nascent free radical is generated upon collisional activation and simultaneously induces glycan fragmentation. Higher-energy collisional dissociation (HCD) and collision-induced dissociation (CID) are employed to provide complementary structural information for the identification and discrimination of glycan isomers by providing different fragmentation pathways to generate informative, structurally significant product ions. Furthermore, multiple-stage tandem mass spectrometry (MS3 CID) provides supplementary and valuable structural information through the generation of characteristic parent-structure-dependent fragment ions.
Project description:A liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based methodology has been developed to differentiate core- and antennary-fucosylated glycosylation of glycopeptides. Both the glycosylation sites (heterogeneity) and multiple possible glycan occupancy at each site (microheterogeneity) can be resolved via intact glycopeptide analysis. The serum glycoprotein alpha-1-antitrypsin (A1AT) which contains both core- and antennary-fucosylated glycosites was used in this study. Sialidase was used to remove the sialic acids in order to simplify the glycosylation microheterogeneity and to enhance the MS signal of glycopeptides with similar glycan structures. β1-3,4 galactosidase was used to differentiate core- and antennary-fucosylation. In-source dissociation was found to severely affect the identification and quantification of glycopeptides with low abundance glycan modification. The settings of the mass spectrometer were therefore optimized to minimize the in-source dissociation. A three-step mass spectrometry fragmentation strategy was used for glycopeptide identification, facilitated by pGlyco software annotation and manual checking. The collision energy used for initial glycopeptide fragmentation was found to be crucial for improved detection of oxonium ions and better selection of Y1 ion (peptide+GlcNAc). Structural assignments revealed that all three glycosylation sites of A1AT glycopeptides contain complex N-glycan structures: site Asn70 contains biantennary glycans without fucosylation; site Asn107 contains bi-, tri- and tetra-antennary glycans with both core- and antennary-fucosylation; site Asn271 contains bi- and tri-antennary glycans with both core- and antennary-fucosylation. The relative intensity of core- and antennary-fucosylation on Asn107 was similar to that of the A1AT protein indicating that the glycosylation level of Asn107 is much larger than the other two sites.
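The Y1 ion (peptide+GlcNAc) mentioned above has a predictable m/z once the peptide mass is known. The short sketch below shows the arithmetic, assuming a neutral monoisotopic peptide mass and protonated ions. The function name `y1_mz` is hypothetical and not part of pGlyco.

```python
PROTON = 1.007276   # mass of a proton (Da)
HEXNAC = 203.07937  # monoisotopic residue mass of HexNAc/GlcNAc (Da)

def y1_mz(peptide_mass: float, charge: int) -> float:
    """m/z of the Y1 ion: the peptide carrying a single GlcNAc, at the given charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass + HEXNAC + charge * PROTON) / charge
```

A peak near this value in the first-stage fragmentation spectrum is the kind of ion that would be selected for further fragmentation to sequence the peptide.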
Project description:Simultaneous elucidation of the glycan structure and the glycosylation site are needed to reveal the biological function of protein glycosylation. In this study, we employed a recent type of fragmentation termed higher energy collisional dissociation (HCD) to examine fragmentation patterns of intact glycopeptides generated from a mixture of standard glycosylated proteins. The normalized collisional energy (NCE) value for HCD was varied from 30 to 60% to evaluate the optimal conditions for the fragmentation of peptide backbones and glycoconjugates. Our results indicated that HCD with lower NCE values preferentially fragmented the sugar chains attached to the peptides to generate a ladder of neutral loss of monosaccharides, thereby enabling the putative glycan structure characterization. In addition, detection of the oxonium ions enabled unambiguous differentiation of glycopeptides from non-glycopeptides. In contrast, HCD with higher NCE values preferentially fragmented the peptide backbone and, thus, provided information needed for confident peptide identification. We evaluated the HCD approach with alternating NCE parameters for confident characterization of intact N- and O-linked glycopeptides in a single liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis. In addition, we applied a novel data analysis pipeline, so-called GlycoFinder, to form a basis for automated data analysis. Overall, 38 unique intact glycopeptides corresponding to eight glycosylation sites (six N-linked and two O-linked sites) were confidently identified from a standard protein mixture. This approach provided concurrent characterization of both the peptide and the glycan, thereby enabling comprehensive structural characterization of glycoproteins in a single LC-MS/MS analysis.
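The use of oxonium ions to separate glycopeptide spectra from non-glycopeptide spectra can be sketched as a simple peak lookup. The m/z values are standard singly protonated oxonium ions. The function names, the 0.01 Da tolerance, the 5% relative intensity floor, and the two-hit rule are assumptions for illustration only and are not taken from GlycoFinder.

```python
from typing import Dict, List, Tuple

# Common singly charged oxonium ions (m/z).
OXONIUM_MZ: Dict[str, float] = {
    "HexNAc fragment": 138.0550,
    "HexNAc": 204.0867,
    "NeuAc-H2O": 274.0921,
    "NeuAc": 292.1027,
    "HexHexNAc": 366.1395,
}

def oxonium_hits(peaks: List[Tuple[float, float]], tol_da: float = 0.01,
                 min_rel_intensity: float = 0.05) -> List[str]:
    """Return names of oxonium ions found among (m/z, intensity) peaks."""
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    hits = []
    for name, mz in OXONIUM_MZ.items():
        # A hit needs a peak within tolerance that is not negligible vs. the base peak.
        if any(abs(p_mz - mz) <= tol_da and intensity >= min_rel_intensity * base
               for p_mz, intensity in peaks):
            hits.append(name)
    return hits

def looks_like_glycopeptide(peaks: List[Tuple[float, float]], min_hits: int = 2) -> bool:
    """Flag a spectrum as a likely glycopeptide spectrum (heuristic)."""
    return len(oxonium_hits(peaks)) >= min_hits
```

Requiring more than one oxonium ion is a conservative design choice here; a single 204.0867 peak could be accepted instead if sensitivity matters more than specificity.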
Project description:Glycosylation is among the most abundant and diverse protein post-translational modifications (PTMs) identified to date. The structural analysis of this PTM is challenging because of the diverse monosaccharides which are not conserved among organisms, the branched nature of glycans, their isomeric structures, and heterogeneity in the glycan distribution at a given site. Glycoproteomics experiments have adopted the traditional high-throughput LC-MSn proteomics workflow to analyze site-specific glycosylation. However, comprehensive computational platforms for data analyses are scarce. To address this limitation, we present a comprehensive, open-source, modular software for glycoproteomics data analysis called GlycoPAT (GlycoProteomics Analysis Toolbox; freely available from www.VirtualGlycome.org/glycopat). The program includes three major advances: (1) "SmallGlyPep," a minimal linear representation of glycopeptides for MSn data analysis. This format allows facile serial fragmentation of both the peptide backbone and PTM at one or more locations. (2) A novel scoring scheme based on calculation of the "Ensemble Score (ES)," a measure that scores and rank-orders MS/MS spectrum for N- and O-linked glycopeptides using cross-correlation and probability based analyses. (3) A false discovery rate (FDR) calculation scheme where decoy glycopeptides are created by simultaneously scrambling the amino acid sequence and by introducing artificial monosaccharides by perturbing the original sugar mass. Parallel computing facilities and user-friendly GUIs (Graphical User Interfaces) are also provided. GlycoPAT is used to catalogue site-specific glycosylation on simple glycoproteins, standard protein mixtures and human plasma cryoprecipitate samples in three common MS/MS fragmentation modes: CID, HCD and ETD. It is also used to identify 960 unique glycopeptides in cell lysates from prostate cancer cells. 
The results show that the simultaneous consideration of peptide and glycan fragmentation is necessary for high quality MSn spectrum annotation in CID and HCD fragmentation modes. Additionally, they confirm the suitability of GlycoPAT to analyze shotgun glycoproteomics data.
Project description:Confident characterization of intact glycopeptides is a challenging task in mass spectrometry-based glycoproteomics due to microheterogeneity of glycosylation, complexity of glycans, and insufficient fragmentation of peptide backbones. Open mass spectral library search is a promising computational approach to peptide identification, but its potential in the identification of glycopeptides has not been fully explored. Here we present pMatchGlyco, a new spectral library search tool for intact N-linked glycopeptide identification using high-energy collisional dissociation (HCD) tandem mass spectrometry (MS/MS) data. In pMatchGlyco, (1) MS/MS spectra of deglycopeptides are used to create a spectral library, (2) MS/MS spectra of glycopeptides are matched to the spectra in the library in an open (precursor tolerant) manner and the glycans are inferred, and (3) a false discovery rate is estimated for top-scored matches above a threshold. The efficiency and reliability of pMatchGlyco were demonstrated on a data set from a mixture of six standard glycoproteins and a complex glycoprotein data set generated from the human cancer cell line OVCAR3.
Project description:Glycopeptides from a tryptic digest of chicken ovomucoid were enriched using a simplified lectin affinity chromatography (LAC) platform, and characterized by high-resolution mass spectrometry (MS) as well as ion mobility spectrometry (IMS)-MS. The LAC platform effectively enriched the glycoproteome, from which a total of 117 glycopeptides containing 27 glycan forms were identified for this protein. IMS-MS analysis revealed a high degree of glycopeptide site heterogeneity. Comparison of the IMS distributions of the glycopeptides from different charge states reveals that higher charge states allow more structures to be resolved. Presumably the repulsive interactions between charged sites lead to more open configurations, which are more readily separated compared with the more compact, lower charge state forms of the same groups of species. Combining IMS with collision induced dissociation (CID) made it possible to determine the presence of isomeric glycans and to reconstruct their IMS profiles. This study illustrates a workflow involving hybrid techniques for determining glycopeptide site heterogeneity and evaluating structural diversity of glycans and glycopeptides.