Peptide Selection for Accurate Targeted Protein Quantification via a Dimethylation High-Resolution Mass Spectrum Strategy with a Peptide Release Kinetic Model.
ABSTRACT: A crucial step in accurate targeted protein quantification using targeted proteomics is to determine optimal proteotypic peptides representing targeted proteins. In this study, a peptide selection workflow for determining proteotypic peptides, using a dimethylation high-resolution mass spectrum strategy with a peptide release kinetic model, was investigated and applied to peptide selection for bovine serum albumin. After evaluation of the specificity, digestibility, recovery, and stability of the tryptic peptides of bovine serum albumin, LVNELTEFAK was selected as the optimal proteotypic peptide. The quantification method using LVNELTEFAK gave a linear range of 1-100 ppm with a coefficient greater than 0.9990, and the detection limit of bovine serum albumin in milk was 0.78 mg/kg. Compared with the proteotypic peptides selected by Skyline, the method showed better performance in method validation. The workflow exhibited high comprehensiveness and efficiency in peptide selection, facilitating accurate targeted protein quantification in the food matrix, where protein standards are lacking.
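The candidate pool for such a workflow is the set of tryptic peptides of the target protein. As a minimal sketch of how that pool can be enumerated in silico, the Python function below applies the common trypsin rule (cleave after K or R unless the next residue is P). The function name `tryptic_digest` and the short input sequence are hypothetical and chosen only for illustration; the sequence is a toy fragment containing LVNELTEFAK, not the full bovine serum albumin sequence, and the abstract does not state which digestion rule or software the authors used.

```python
def tryptic_digest(sequence):
    """Split a protein sequence into fully cleaved tryptic peptides.

    Assumed rule: cleave C-terminal to K or R, except when the next
    residue is P. Missed cleavages are not modeled in this sketch.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Keep any C-terminal remainder that does not end in K or R.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


# Toy fragment for illustration only (hypothetical input).
toy_fragment = "HVKLVNELTEFAKTCVADESHAGCEK"
for peptide in tryptic_digest(toy_fragment):
    print(peptide)
```

Each peptide produced this way would still have to pass the experimental checks named in the abstract (specificity, digestibility, recovery, stability); the sketch only generates candidates.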
Project description:Targeted mass spectrometry has become the method of choice to gain absolute quantification information of high quality, which is essential for a quantitative understanding of biological systems. However, the design of absolute protein quantification assays remains challenging due to variations in peptide observability and incomplete knowledge about factors influencing peptide detectability. Here, we present a deep learning algorithm for peptide detectability prediction, d::pPop, which allows the informed selection of synthetic proteotypic peptides for the successful design of targeted proteomics quantification assays. The deep neural network is able to learn a regression model that relates the physicochemical properties of a peptide to its ion intensity detected by mass spectrometry. The approach makes use of experimentally detected deviations from the assumed equimolar abundance of all peptides derived from a given protein. Trained on extensive proteomics datasets, d::pPop's plant and non-plant specific models can predict the quality of proteotypic peptides for not yet experimentally identified proteins. Interrogating the deep neural network after learning from ~76,000 peptides per model organism makes it possible to investigate the impact of different physicochemical properties on the observability of a peptide, thus providing insights into peptide observability as a multifaceted process. Empirical evaluation with rank accuracy metrics showed that our prediction approach outperforms existing algorithms. We circumvent the delicate step of selecting positive and negative training sets and at the same time more closely reflect the need to select the most promising peptides for targeting a protein of interest. Further, we used an artificial QconCAT protein to experimentally validate the observability prediction.
Our proteotypic peptide prediction approach not only facilitates the design of absolute protein quantification assays via a user-friendly web interface but also enables the selection of proteotypic peptides for not yet observed proteins, hence rendering the tool especially useful for plant research.
Project description:Mammalian host response to pathogens is associated with fluctuations in highly abundant proteins in body fluids as well as with the regulation of proteins expressed in relatively low copy numbers, such as cytokines secreted from immune cells and endothelium. Hence, efficient monitoring of proteins associated with host response to pathogens remains a challenging task. In this paper, we present a targeted proteome analysis of a panel of 20 proteins that are widely believed to be key players and indicators of bovine host response to mastitis pathogens. Stable isotope-labeled variants of two concordant proteotypic peptides from each of these 20 proteins were obtained through the QconCAT method. We present the quantotypic properties of these 40 proteotypic peptides and discuss their application to research in host-pathogen interactions. Our results clearly demonstrate a robust monitoring of 17 targeted host-response proteins. Twelve of these were readily quantified in a simple extraction of mammary gland tissues, while the expression levels of the remaining proteins were too low for direct and stable quantification; hence, their accurate quantification requires further fractionation of mammary gland tissues.
Project description:Mass spectrometry-based methods for absolute quantification of proteins, such as QconCAT, rely on internal standards of stable-isotope labeled reference peptides, or "Q-peptides," to act as surrogates. Key to the success of this and related methods for absolute protein quantification (such as AQUA) is selection of the Q-peptide. Here we describe a novel method, CONSeQuence (consensus predictor for Q-peptide sequence), based on four different machine learning approaches for Q-peptide selection. CONSeQuence demonstrates improved performance over existing methods for optimal Q-peptide selection in the absence of prior experimental information, as validated using two independent test sets derived from yeast. Furthermore, we examine the physicochemical parameters associated with good peptide surrogates, and demonstrate that in addition to charge and hydrophobicity, peptide secondary structure plays a significant role in determining peptide "detectability" in liquid chromatography-electrospray ionization experiments. We relate peptide properties to protein tertiary structure, demonstrating a counterintuitive preference for buried status for frequently detected peptides. Finally, we demonstrate the improved efficacy of the general approach by applying a predictor trained on yeast data to sets of proteotypic peptides from two additional species taken from an existing peptide identification repository.
Project description:The relatively small numbers of proteins and fewer possible post-translational modifications in microbes provide a unique opportunity to comprehensively characterize their dynamic proteomes. We have constructed a PeptideAtlas (PA) covering 62.7% of the predicted proteome of the extremely halophilic archaeon Halobacterium salinarum NRC-1 by compiling approximately 636 000 tandem mass spectra from 497 mass spectrometry runs in 88 experiments. Analysis of the PA with respect to biophysical properties of constituent peptides, functional properties of parent proteins of detected peptides, and performance of different mass spectrometry approaches has highlighted plausible strategies for improving proteome coverage and selecting signature peptides for targeted proteomics. Notably, discovery of a significant correlation between absolute abundances of mRNAs and proteins has helped identify low abundance of proteins as the major limitation in peptide detection. Furthermore, we have discovered that iTRAQ labeling for quantitative proteomic analysis introduces a significant bias in peptide detection by mass spectrometry. Therefore, despite identifying at least one proteotypic peptide for almost all proteins in the PA, a context-dependent selection of proteotypic peptides appears to be the most effective approach for targeted proteomics.
Project description:The use of protein tagging to facilitate detailed characterization of target proteins has not only revolutionized cell biology, but also enabled biochemical analysis through efficient recovery of the protein complexes wherein the tagged proteins reside. The endogenous use of these tags for detailed protein characterization is widespread in lower organisms that allow for efficient homologous recombination. With the recent advances in genome engineering, tagging of endogenous proteins is now within reach for most experimental systems, including mammalian cell line cultures. In this work, we describe the selection of peptides with ideal mass spectrometry characteristics for use in quantification of tagged proteins using targeted proteomics. We mined the proteome of the hyperthermophile Pyrococcus furiosus to obtain two peptides that are unique in the proteomes of all known model organisms (proteotypic) and allow sensitive quantification of target proteins in a complex background. By combining these 'Proteotypic peptides for Quantification by SRM' (PQS peptides) with epitope tags, we demonstrate their use in co-immunoprecipitation experiments upon transfection of protein pairs, or after introduction of these tags in the endogenous proteins through genome engineering. Endogenous protein tagging for absolute quantification provides a powerful extra dimension to protein analysis, allowing the detailed characterization of endogenous proteins.
Project description:Allelic polymorphism of the apolipoprotein E (ApoE) gene (ApoE ε2, ApoE ε3 and ApoE ε4 alleles) gives rise to three protein isoforms (ApoE2, ApoE3 and ApoE4) that differ by 1 or 2 amino acids. Inheritance of the ApoE ε4 allele is a risk factor for developing Alzheimer's disease (AD). The potential diagnostic value of ApoE protein levels in biological fluids (i.e. cerebrospinal fluid, plasma and serum) for distinguishing between AD patients and healthy elderly subjects is subject to great controversy. Although a recent study reported subnormal total ApoE and ApoE4 levels in the plasma of AD patients, other studies have found normal or even elevated protein levels (versus controls). Because all previously reported assays were based on immunoenzymatic techniques, we decided to develop an orthogonal assay based on targeted mass spectrometry by tracking (i) a proteotypic peptide common to all ApoE isoforms and (ii) a peptide that is specific for the ε4 allele. After trypsin digestion, the ApoE4-specific peptide contains an oxidation-prone methionine residue. The endogenous methionine oxidation level was evaluated in a small cohort (n=68) of heterozygous ε3ε4 carriers containing both healthy controls and AD patients. As expected, the proportion of oxidized residues varied from 0 to 10%, with an average of 5%. We therefore developed a standardized strategy for the unbiased, absolute quantification of ApoE4, based on performic acid oxidization of methionine. Once the sample workflow had been thoroughly validated, it was applied to the concomitant quantification of total ApoE and ApoE4 isoform in a large case-control study (n=669). The final measurements were consistent with most previously reported ApoE concentration values and confirm the influence of the different alleles on the protein expression level.
Our results illustrate (i) the reliability of selected reaction monitoring-based assays and (ii) the value of the oxidization step for unbiased monitoring of methionine-containing proteotypic peptides. Furthermore, a statistical analysis indicated that neither total ApoE and ApoE4 levels nor the ApoE/ApoE4 ratio correlated with the diagnosis of AD. These findings reinforce the conclusions of previous studies in which plasma ApoE levels had no obvious clinical significance.
Project description:We report a method for high-throughput, cost-efficient empirical discovery of optimal proteotypic peptides and fragment ions for targeted proteomics applications using in vitro-synthesized proteins. We demonstrate the approach using human transcription factors, which are typically difficult, low-abundance targets, and empirically derive proteotypic peptides for 98% of the target proteins. We show that targeted proteomic assays developed using our approach facilitate robust in vivo quantification of human transcription factors.
Project description:Systems biology relies on data sets in which the same group of proteins is consistently identified and precisely quantified across multiple samples, a requirement that is only partially achieved by current proteomics approaches. Selected reaction monitoring (SRM), also called multiple reaction monitoring, is emerging as a technology that ideally complements the discovery capabilities of shotgun strategies by its unique potential for reliable quantification of analytes of low abundance in complex mixtures. In an SRM experiment, a predefined precursor ion and one of its fragments are selected by the two mass filters of a triple quadrupole instrument and monitored over time for precise quantification. A series of transitions (precursor/fragment ion pairs) in combination with the retention time of the targeted peptide can constitute a definitive assay. Typically, a large number of peptides are quantified during a single LC-MS experiment. This tutorial explains the application of SRM for quantitative proteomics, including the selection of proteotypic peptides and the optimization and validation of transitions. Furthermore, normalization and various factors affecting sensitivity and accuracy are discussed.
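To make the notion of a transition concrete, the following Python sketch computes precursor and singly charged y-ion m/z values for a peptide from standard monoisotopic residue masses and lists the resulting precursor/fragment pairs. The function names, the choice of a doubly charged precursor, the restriction to y-ions from y3 upward, and the example peptide LVNELTEFAK are all assumptions made for illustration; the tutorial itself does not prescribe them, and modifications such as carbamidomethylated cysteine are not handled.

```python
# Monoisotopic residue masses in Da (unmodified amino acids).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added for the free N- and C-termini
PROTON = 1.00728   # mass of one proton


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in sequence) + WATER


def precursor_mz(sequence, charge=2):
    """m/z of the protonated precursor at the given charge state."""
    return (peptide_mass(sequence) + charge * PROTON) / charge


def y_ion_mz(sequence, n, charge=1):
    """m/z of the y_n fragment (the n C-terminal residues)."""
    fragment_mass = sum(MONO[aa] for aa in sequence[-n:]) + WATER
    return (fragment_mass + charge * PROTON) / charge


def transitions(sequence, precursor_charge=2, min_fragment=3):
    """List (Q1 m/z, fragment label, Q3 m/z) for y-ions y_min .. y_(len-1)."""
    q1 = precursor_mz(sequence, precursor_charge)
    return [
        (q1, f"y{n}", y_ion_mz(sequence, n))
        for n in range(min_fragment, len(sequence))
    ]


for q1, label, q3 in transitions("LVNELTEFAK"):
    print(f"{q1:.3f} -> {q3:.3f} ({label})")
```

In practice, which of these candidate transitions are kept is an empirical decision based on signal intensity and freedom from interference, which is the optimization and validation step the tutorial refers to.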
Project description:Proteomics research is beginning to expand beyond the more traditional shotgun analysis of protein mixtures to include targeted analyses of specific proteins using mass spectrometry. Integral to the development of a robust assay based on targeted mass spectrometry is prior knowledge of which peptides provide an accurate and sensitive proxy of the originating gene product (i.e., proteotypic peptides). To develop a catalog of "proteotypic peptides" in human heart, TRIzol extracts of left-ventricular tissue from nonfailing and failing human heart explants were optimized for shotgun proteomic analysis using Multidimensional Protein Identification Technology (MudPIT). Ten replicate MudPIT analyses were performed on each tissue sample and resulted in the identification of 30 605 unique peptides with a q-value ≤ 0.01, corresponding to 7138 unique human heart proteins. Experimental observation frequencies were assessed and used to select over 4476 proteotypic peptides for 2558 heart proteins. This human cardiac data set can serve as a public reference to guide the selection of proteotypic peptides for future targeted mass spectrometry experiments monitoring potential protein biomarkers of human heart diseases.
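Selection by observation frequency can be sketched as follows. This is a simplified Python illustration, not the authors' procedure: it defines observation frequency as the fraction of replicate runs in which a peptide was identified, and the 0.5 cutoff, the function names, and the placeholder peptide strings are hypothetical.

```python
from collections import Counter


def observation_frequency(runs):
    """Fraction of runs in which each peptide was identified.

    `runs` is a list of iterables, one per replicate analysis,
    each holding the peptide sequences identified in that run.
    """
    counts = Counter()
    for peptides in runs:
        counts.update(set(peptides))  # count each peptide once per run
    n_runs = len(runs)
    return {peptide: count / n_runs for peptide, count in counts.items()}


def select_proteotypic(runs, min_frequency=0.5):
    """Peptides observed in at least `min_frequency` of the runs (assumed cutoff)."""
    frequencies = observation_frequency(runs)
    return sorted(p for p, f in frequencies.items() if f >= min_frequency)


# Placeholder identifications from four hypothetical replicate runs.
replicate_runs = [
    ["AAK", "BBK"],
    ["AAK"],
    ["AAK", "CCK"],
    ["BBK", "AAK"],
]
print(select_proteotypic(replicate_runs))
```

A fuller treatment would compute the frequency per protein, relative to the runs in which that protein was identified at all, so that peptides from low-abundance proteins are not penalized for the protein's absence.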
Project description:Quantitative cross-linking/mass spectrometry (QCLMS) is an emerging approach to study conformational changes of proteins and multi-subunit complexes. Distinguishing protein conformations requires reproducibly identifying and quantifying cross-linked peptides. Here we analyzed the variation between multiple cross-linking reactions using bis[sulfosuccinimidyl] suberate (BS3)-cross-linked human serum albumin (HSA) and evaluated how reproducibly cross-linked peptides can be identified and quantified by LC-MS analysis. To make QCLMS accessible to a broader research community, we developed a workflow that integrates the established software tools MaxQuant for spectra preprocessing, Xi for cross-linked peptide identification, and finally Skyline for quantification (MS1 filtering). Out of the 221 unique residue pairs identified in our sample, 124 were subsequently quantified across 10 analyses with coefficient of variation (CV) values of 14% (injection replica) and 32% (reaction replica). Thus our results demonstrate that the reproducibility of QCLMS is in line with the reproducibility of general quantitative proteomics and we establish a robust workflow for MS1-based quantitation of cross-linked peptides.
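For reference, the coefficient of variation used as the reproducibility measure here is the standard deviation divided by the mean, expressed as a percentage. The Python sketch below computes it for one quantified feature across replicates; the function name and the intensity values are hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not say how the reported CVs were calculated or aggregated.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical MS1 intensities of one residue pair across replicate analyses.
intensities = [9.0, 10.0, 11.0]
print(f"CV = {cv_percent(intensities):.1f}%")
```

Applied per residue pair, separately for injection replicates and reaction replicates, this separates instrument variation from variation introduced by the cross-linking reaction itself.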