Statistical learning of peptide retention behavior in chromatographic separations: a new kernel-based approach for computational proteomics.
ABSTRACT: BACKGROUND: High-throughput peptide and protein identification technologies have benefited tremendously from strategies based on tandem mass spectrometry (MS/MS) in combination with database searching algorithms. A major problem with existing methods lies in the significant number of false positive and false negative annotations. So far, standard algorithms for protein identification do not use the information gained from the separation processes usually involved in peptide analysis, such as retention time information, which is readily available from chromatographic separation of the sample. Identification can thus be improved by comparing measured retention times to predicted retention times. Current prediction models are derived from a set of measured test analytes but they usually require large amounts of training data. RESULTS: We introduce a new kernel function which can be applied in combination with support vector machines to a wide range of computational proteomics problems. We show the performance of this new approach by applying it to the prediction of peptide adsorption/elution behavior in strong anion-exchange solid-phase extraction (SAX-SPE) and ion-pair reversed-phase high-performance liquid chromatography (IP-RP-HPLC). Furthermore, the predicted retention times are used to improve spectrum identifications by a p-value-based filtering approach. The approach was tested on a number of different datasets and shows excellent performance while requiring only very small training sets (about 40 peptides instead of thousands). Using the retention time predictor in our retention time filter improves the fraction of correctly identified peptide mass spectra significantly. CONCLUSION: The proposed kernel function is well-suited for the prediction of chromatographic separation in computational proteomics and requires only a limited amount of training data.
The performance of this new method is demonstrated by applying it to peptide retention time prediction in IP-RP-HPLC and prediction of peptide sample fractionation in SAX-SPE. Finally, we incorporate the predicted chromatographic behavior in a p-value based filter to improve peptide identifications based on liquid chromatography-tandem mass spectrometry.
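The p-value-based retention time filter mentioned above can be illustrated with a minimal Python sketch. The abstract does not say how the p-value is computed, so this sketch assumes that prediction errors on the training peptides are roughly normally distributed. The function names, the dictionary keys and the 0.05 threshold are hypothetical. The retention time predictor itself (the kernel-based support vector machine) is not reproduced here; its predictions are taken as given inputs.

```python
import math
from statistics import mean, stdev

def fit_residual_model(observed, predicted):
    """Estimate mean and sample standard deviation of the prediction errors."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    return mean(residuals), stdev(residuals)

def rt_p_value(observed_rt, predicted_rt, mu, sigma):
    """Two-sided p-value of a retention time deviation.

    Assumption: errors follow a normal distribution with mean mu and
    standard deviation sigma (not stated in the source abstract).
    """
    z = abs((observed_rt - predicted_rt) - mu) / sigma
    return math.erfc(z / math.sqrt(2.0))

def filter_identifications(candidates, mu, sigma, alpha=0.05):
    """Keep candidates whose deviation is not significant at level alpha."""
    return [
        c for c in candidates
        if rt_p_value(c["observed_rt"], c["predicted_rt"], mu, sigma) >= alpha
    ]
```

A candidate identification whose observed retention time lies far from the prediction receives a small p-value and is discarded, while candidates that agree with the prediction are kept.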
Project description:Reversed-phase liquid chromatography is the most commonly used separation method for shotgun proteomics. Nanoflow chromatography has emerged as the preferred chromatography method for its increased sensitivity and separation. Despite its common use, a wide range of parameters and conditions is used across research groups. These parameters affect the quality of the chromatographic separation, which is critical to maximizing the number of peptide identifications and minimizing ion suppression. Here we examined the relationship between column lengths, gradient lengths, peptide identifications, and peptide peak capacity. We found that while longer column and gradient lengths generally increase peptide identifications, the degree of improvement depends on both parameters and diminishes at longer column and gradient lengths. Peak capacity, in comparison, showed a more linear increase with column and gradient lengths. We discuss the discrepancy between these two results and some of the considerations that should be taken into account when deciding on the chromatographic conditions for a proteomics experiment.
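As a rough illustration of the peak capacity measure discussed above, the following Python sketch uses a common textbook definition: one plus the gradient time divided by the average peak width at base (4 sigma). The abstract does not state which definition the authors used, so the formula choice, the function name and the use of full width at half maximum (FWHM) as input are assumptions.

```python
import math

# For a Gaussian peak, FWHM = 2 * sqrt(2 * ln 2) * sigma.
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def peak_capacity(gradient_minutes, fwhm_minutes):
    """Gradient peak capacity, 1 + t_gradient / mean 4-sigma peak width.

    Assumption: peaks are Gaussian and widths are given as FWHM in minutes.
    """
    sigmas = [width * FWHM_TO_SIGMA for width in fwhm_minutes]
    mean_base_width = 4.0 * sum(sigmas) / len(sigmas)
    return 1.0 + gradient_minutes / mean_base_width
```

With this definition, peak capacity grows linearly with gradient length only if peak widths stay constant; in practice peaks broaden on longer gradients, so the measured widths have to be supplied for each condition.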
Project description:A methodology was implemented for purifying peptides in one chromatographic run via solid-phase extraction (SPE) in reversed-phase (RP) mode with gradient elution, obtaining high-purity products with good yields. Crude peptides were analyzed by reversed-phase high-performance liquid chromatography, and a new mathematical model based on the retention time was developed to predict the percentage of organic modifier at which each peptide will elute in RP-SPE. This information was used to design the elution program for each molecule. It was possible to purify peptides with different physicochemical properties, showing that the method is versatile and requires little solvent, which makes it a low-pollution option. RP-SPE can easily be implemented routinely. It is an alternative for enriching and purifying synthetic or natural molecules.
Project description:A procedure involving microwave-assisted extraction (MAE) followed by solid-phase extraction (SPE) was established for the extraction and purification of three bisbenzylisoquinoline alkaloids from Stephania cepharantha, and a reversed-phase high-performance liquid chromatography (HPLC) method was developed for the quantification of the target alkaloids. Chromatographic separation was achieved on a Phenomenex Luna Phenyl-Hexyl column. Prior to the HPLC analysis, the alkaloids were rapidly extracted by an optimized MAE process using 0.01 mol/L hydrochloric acid as the solvent. The MAE extract was subsequently purified by SPE using a cation-exchange polymeric cartridge. The MAE-SPE procedure extracted the three alkaloids with satisfactory recoveries ranging from 100.44 to 102.12%. In comparison with the MAE, Soxhlet and ultrasonic-assisted extractions, the proposed MAE-SPE method showed satisfactory cleanup efficiency. Thus, the validated MAE-SPE-HPLC method is specific, accurate and applicable to the determination of alkaloids in S. cepharantha.
Project description:Capitalizing on the massive increase in sample concentrations which are produced by extremely low elution volumes, nanoliquid chromatography-electrospray ionization-tandem mass spectrometry (nano-LC-ESI-MS/MS) is currently one of the most sensitive analytical technologies for the comprehensive characterization of complex protein samples. However, despite tremendous technological improvements made in the production and the packing of monodisperse spherical particles for nanoflow high-pressure liquid chromatography (HPLC), current state-of-the-art systems still suffer from limits in operation at the maximum potential of the technology. With the recent introduction of the μPAC system, which provides perfectly ordered micropillar array based chromatographic support materials, completely new chromatographic concepts for optimization toward the needs of ultrasensitive proteomics become available. Here we report on a series of benchmarking experiments comparing the performance of a commercially available 50 cm micropillar array column to a widely used nanoflow HPLC column for the proteomics analysis of 10 ng of tryptic HeLa cell digest. Comparative analysis of LC-MS/MS-data corroborated that micropillar array cartridges provide outstanding chromatographic performance, excellent retention time stability, and increased sensitivity in the analysis of low-input proteomics samples and thus repeatedly yielded almost twice as many unique peptide and unique protein group identifications when compared to conventional nanoflow HPLC columns.
Project description:Rejection of false positive peptide matches in database searches of shotgun proteomic experimental data is highly desirable. Several methods have been developed that use the peptide retention time to refine and improve peptide identifications from database search algorithms. This report describes the implementation of an automated approach to reduce false positives and validate peptide matches. A robust linear regression based algorithm was developed to automate the evaluation of peptide identifications obtained from shotgun proteomic experiments. The algorithm scores peptides based on their predicted and observed reversed-phase liquid chromatography retention times. The robust algorithm does not require internal or external peptide standards to train or calibrate the linear regression model used for peptide retention time prediction. The algorithm is generic and can be incorporated into any database search program to perform automated evaluation of the candidate peptide matches based on their retention times. It provides a statistical score for each peptide match based on its retention time. Analysis of peptide matches where the retention time score was included resulted in a significant reduction of false positive matches with little effect on the number of true positives. Overall, higher sensitivities and specificities were achieved for database searches carried out with MassMatrix, Mascot and X!Tandem after implementation of the retention time based score algorithm.
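The idea of scoring peptide matches with a robust regression fitted directly to the search results, without calibration standards, can be sketched in Python as follows. The abstract does not name the robust estimator or the score, so this sketch substitutes a Theil-Sen line (median of pairwise slopes) and a normal-tail score scaled by the median absolute deviation. The per-peptide hydrophobicity index used as predictor, and all function names, are hypothetical.

```python
import math
from itertools import combinations
from statistics import median

def theil_sen(x, y):
    """Robust line fit: median pairwise slope, then median intercept."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept

def robust_scale(residuals):
    """Median absolute deviation, scaled to match a normal standard deviation."""
    centre = median(residuals)
    return 1.4826 * median(abs(r - centre) for r in residuals)

def rt_scores(hydrophobicity, observed_rt):
    """Score each match by how plausible its retention time deviation is."""
    slope, intercept = theil_sen(hydrophobicity, observed_rt)
    residuals = [
        rt - (slope * h + intercept)
        for h, rt in zip(hydrophobicity, observed_rt)
    ]
    scale = robust_scale(residuals)
    if scale == 0.0:
        raise ValueError("residual scale is zero; scores are undefined")
    return [math.erfc(abs(r) / (scale * math.sqrt(2.0))) for r in residuals]
```

Because both the line and the scale are based on medians, a minority of false positive matches with implausible retention times barely affects the fit and receives scores close to zero.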
Project description:One of the main challenges in high-throughput serum profiling by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is the development of proteome fractionation approaches that allow the acquisition of reproducible profiles with a maximum number of spectral features and minimum interferences from biological matrices. This study evaluates a new class of solid-phase extraction (SPE) pipette tips embedded with different chromatographic media for fractionation of model protein digests and serum samples. The materials embedded include strong anion exchange (SAX), weak cation exchange (WCX), C18, C8, C4, immobilized metal affinity chromatography (IMAC) and zirconium dioxide particles. Simple and rapid serum proteome profiling protocols based on these SPE micro tips are described and tested using a variety of MALDI matrices. We show that different types of particle-embedded SPE micro tips provide complementary information in terms of the spectral features detected for beta-casein digests and control human serum samples. The effect of different sample pretreatments, such as serum dilution and ultrafiltration using molecular weight cut-off membranes, and the reproducibility observed for replicate experiments, are also evaluated. The results demonstrate the usefulness of these simple SPE tips combined with offline MALDI-TOF MS for obtaining information-rich serum profiles, resulting in a robust, versatile and reproducible open-source platform for serum biomarker discovery.
Project description:Background: Determination of psychotropic drugs in clinical studies is important, and establishing methods for measuring these drugs in biological matrices is essential for patient safety. The search for new detection methods is one of the most important challenges of modern scientific research. Methods for analyzing psychotropic drugs and their metabolites in different biological samples should combine a highly efficient separation technique, such as high-performance liquid chromatography (HPLC), with a sensitive detection method and effective sample preparation. Objective: Retention, peak symmetry and system efficiency of vortioxetine were investigated on Hydro RP, Polar RP, HILIC-A (with silica stationary phase), HILIC-B (with aminopropyl stationary phase), ACE HILIC-N (with polyhydroxy stationary phase) and SCX columns. Various mobile phases containing methanol or acetonitrile as organic modifiers and different additives were also applied to obtain optimal retention, peak shape, and system efficiency. The best chromatographic procedure was used for simultaneous analysis of vortioxetine and its metabolites in human serum, urine and saliva samples. Methods: Analysis of vortioxetine was performed in various chromatographic systems: reversed-phase (RP) systems on alkyl-bonded or phenyl stationary phases, hydrophilic interaction liquid chromatography (HILIC), and ion-exchange chromatography (IEC). Based on the dependence of log k on the concentration of the organic modifier, log kw values for vortioxetine in various chromatographic systems were determined and compared with calculated log P values. Solid-phase extraction (SPE) was applied for sample pre-treatment before HPLC analysis. An HPLC-QTOF-MS method was applied to confirm the presence of vortioxetine and some of its metabolites in biological samples collected from a psychiatric patient.
Conclusions: Retention parameters differed with the chromatographic system applied. The various properties of the stationary phases resulted in differences in vortioxetine retention, system efficiency, and peak shape. Lipophilicity parameters were also determined using different HPLC conditions. The best-performing systems were chosen for the analysis of vortioxetine in biological samples. Serum, urine and saliva samples collected from patients treated with vortioxetine can all be used for determination of the drug. For the first time, vortioxetine was detected in a patient's saliva. The results indicate that saliva samples, whose collection is non-invasive and painless, can be used for determination and therapeutic drug monitoring in patients.
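The log kw determination described in the Methods can be illustrated with a short Python sketch. It assumes the usual linear relationship log k = log kw - S * phi, where phi is the volume fraction of organic modifier, and extrapolates to phi = 0 by ordinary least squares. The abstract does not give the fitting procedure or any data, so the function name and the example values are hypothetical.

```python
def fit_log_kw(phi, log_k):
    """Least-squares fit of log k = log_kw - S * phi.

    phi   : volume fractions of organic modifier (0..1)
    log_k : log10 retention factors measured at those fractions
    Returns (log_kw, S), where log_kw is the intercept at phi = 0.
    """
    n = len(phi)
    mean_phi = sum(phi) / n
    mean_log_k = sum(log_k) / n
    sxx = sum((p - mean_phi) ** 2 for p in phi)
    sxy = sum((p - mean_phi) * (y - mean_log_k) for p, y in zip(phi, log_k))
    slope = sxy / sxx
    log_kw = mean_log_k - slope * mean_phi
    return log_kw, -slope

# Hypothetical example values, not data from the study.
log_kw, s = fit_log_kw([0.3, 0.4, 0.5, 0.6], [1.8, 1.4, 1.0, 0.6])
```

The intercept log kw serves as a chromatographic lipophilicity index and can be compared with a calculated log P, as the abstract describes.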
Project description:In this work, the synthesis, characterization, and application of novel paraben-imprinted polymers as highly selective solid-phase extraction (SPE) sorbents are reported. The imprinted polymers were created using a sol-gel molecular imprinting process. Seven parabens were considered in order to check the phase selectivity. By means of a validated HPLC-photodiode array detector (PDA) method, all seven parabens were resolved in a single chromatographic run of 25 min. These SPE sorbents, packed in-house in empty SPE cartridges, were first characterized in terms of extraction capability, breakthrough volume, retention volume, hold-up volume, number of theoretical plates, and retention factor. Finally, the device was applied to a real urine sample to check the feasibility of the method on a very complex matrix. The new paraben-imprinted SPE sorbents, not yet present in the literature, potentially encourage the development of novel molecularly imprinted polymers (MIPs) to enhance the extraction efficiency, and consequently the overall analytical performance, when trace quantification is required.
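Two of the characterization parameters listed above, the retention factor and the number of theoretical plates, follow from standard chromatographic definitions. The Python sketch below shows them in volume terms. The abstract does not say which formulas the authors applied, so the half-height plate equation and the function names are assumptions.

```python
import math

def retention_factor(retention_volume, hold_up_volume):
    """k = (V_R - V_0) / V_0, with both volumes in the same unit."""
    return (retention_volume - hold_up_volume) / hold_up_volume

def theoretical_plates(retention_volume, half_height_width):
    """N = 8 ln(2) * (V_R / w_half)^2, about 5.54 * (V_R / w_half)^2.

    Assumption: a Gaussian elution profile, with the peak width measured
    at half height in the same unit as the retention volume.
    """
    return 8.0 * math.log(2.0) * (retention_volume / half_height_width) ** 2
```

Both functions work equally well with times instead of volumes, provided the flow rate is constant.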
Project description:There is an increased demand for comprehensive analysis of vitamin D metabolites. This is a major challenge, especially for 1α,25-dihydroxyvitamin D [1α,25(OH)2VitD], because it is biologically active at picomolar concentrations. 4-Phenyl-1,2,4-triazoline-3,5-dione (PTAD) was a revolutionary reagent in dramatically increasing sensitivity of all diene metabolites and allowing the routine analysis of the bioactive, but minor, vitamin D metabolites. A second generation of reagents used large fixed charge groups that increased sensitivity at the cost of a deterioration in chromatographic separation of the vitamin D derivatives. This precludes a survey of numerous vitamin D metabolites without redesigning the chromatographic system used. 2-Nitrosopyridine (PyrNO) demonstrates that one can improve ionization and gain higher sensitivity over PTAD. The resulting vitamin D derivatives facilitate high-resolution chromatographic separation of the major metabolites. Additionally, a liquid-liquid extraction followed by solid-phase extraction (LLE-SPE) was developed to selectively extract 1α,25(OH)2VitD, while reducing ion suppression 2- to 4-fold compared with SPE alone. LLE-SPE followed by PyrNO derivatization and LC/MS/MS analysis is a promising new method for quantifying vitamin D metabolites in a smaller sample volume (100 µL of serum) than previously reported methods. The PyrNO derivatization method is based on the Diels-Alder reaction and thus is generally applicable to a variety of diene analytes.
Project description:Modern nano-HPLC systems are capable of extremely precise control of solvent gradients, allowing high-resolution separation of peptides. Most proteomics laboratories use a simple linear analytical gradient for nano-LC-MS/MS experiments, though recent evidence indicates that optimized non-linear gradients result in increased peptide and protein identifications from cell lysates. In concurrent work, we examined non-linear gradients for the analysis of samples fractionated at the peptide level, where the distribution of peptide retention times often varies by fraction. We hypothesized that greater coverage of these samples could be achieved using per-fraction optimized gradients. We demonstrate that the optimized gradients improve the distribution of peptides throughout the analysis. Using previous-generation MS instrumentation, a considerable gain in peptide and protein identifications can be realized. With current MS platforms, which have faster electronics and achieve shorter duty cycles, the improvement in identifications is smaller. Our gradient optimization method has been implemented in a simple graphical tool (GOAT) that is MS-vendor independent, does not require peptide ID input, and is freely available for non-commercial use at http://proteomics.swmed.edu/goat/
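One simple way to picture a gradient that spreads analytes evenly over the run is an equal-density construction: the gradient breakpoints follow the quantiles of the solvent composition at which signals eluted in a linear scouting run. The Python sketch below illustrates only that general idea. It is not the algorithm implemented in GOAT, which the abstract does not describe, and the function names and inputs are hypothetical.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list, 0 <= q <= 1."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1.0 - fraction) + sorted_values[upper] * fraction

def equal_density_gradient(elution_percent_b, total_minutes, segments):
    """Piecewise-linear gradient as a list of (time in minutes, %B) points.

    elution_percent_b: %B at which each detected signal eluted in a
    linear scouting run (assumed input). Each time segment of the new
    gradient covers an equal share of those signals.
    """
    ordered = sorted(elution_percent_b)
    return [
        (total_minutes * i / segments, quantile(ordered, i / segments))
        for i in range(segments + 1)
    ]
```

Where many signals elute within a narrow %B range, consecutive breakpoints lie close together in %B, so the gradient becomes shallower there and steeper in sparsely populated regions.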