Project description:Benchmarking Proteomics Quantitation in DIA-type data using real patient material to create a benchmark dataset comprising inter-patient heterogeneity
Project description:Recent advancements in liquid chromatography-mass spectrometry (LC-MS) have increasingly focused on high-throughput workflows, leveraging rapid chromatographic gradients and minimal sample input to maximize proteome coverage from limited material. This shift is particularly driven by the rise of single-cell proteomics, where sensitivity and reproducibility are critical. Building on our previous benchmark dataset (PXD028735), we now present an expanded study utilizing the latest generation of LC-MS platforms optimized for high-throughput proteomics. This study features shorter LC gradients and lower sample input to address the growing need for rapid and sensitive proteome analysis. Using a standardized hybrid proteome mixture with defined ratios of Human, Yeast, and E. coli, we generated a comprehensive Data-Dependent and Data-Independent Acquisition (DDA/DIA) dataset across multiple state-of-the-art LC-MS platforms. The updated dataset incorporates the latest acquisition methodologies and extends coverage across an even broader range of data formats, including enhanced ion mobility-enabled and scanning quadrupole-based acquisitions. Our results provide a detailed assessment of the impact of technological advancements and demonstrate how shortening LC gradients influences proteome coverage, quantitative precision, and data consistency across instruments.
Project description:Recent advances in liquid chromatography–mass spectrometry (LC-MS) have accelerated the adoption of high-throughput workflows that deliver deep proteome coverage using minimal sample amounts. This trend is largely driven by single-cell proteomics, where sensitivity and reproducibility are essential. Here, we extend our previous benchmark dataset (PXD028735) with data generated on next-generation LC-MS platforms optimized for rapid proteome analysis. With shorter LC gradients and lower sample amounts, we generated an extensive DDA/DIA dataset on a standardized human-yeast-E. coli hybrid proteome. This new dataset includes data acquired by the Orbitrap Astral, which combines an Orbitrap with a time-of-flight (TOF) mass analyzer, and features new scanning quadrupole-based implementations, extending coverage across different instruments and acquisition strategies. Our comprehensive evaluation highlights how technological advances and reduced LC gradients affect proteome depth, quantitative precision, and cross-instrument consistency. The release of this benchmark dataset via ProteomeXchange (PXD070049) is intended to accelerate cross-platform algorithm development, enhance data mining strategies, and support the continued standardization of short-gradient, high-throughput LC-MS-based proteomics.
Project description:In the last decade, a revolution in liquid chromatography-mass spectrometry (LC-MS) based proteomics has unfolded with the introduction of dozens of novel instruments that incorporate additional data dimensions through innovative acquisition methodologies, in turn inspiring specialized data analysis pipelines. Simultaneously, a growing number of proteomics datasets have been made publicly available through data repositories such as ProteomeXchange, Zenodo and Skyline Panorama. However, developing algorithms to mine these data and assessing their performance on different platforms is currently hampered by the lack of a single benchmark experimental design. Therefore, we acquired a hybrid proteome mixture on different instrument platforms and in all currently available families of data acquisition. Here, we present a comprehensive Data-Dependent and Data-Independent Acquisition (DDA/DIA) dataset acquired using several of the most commonly used current-day instrumental platforms. The dataset consists of over 700 LC-MS runs, including adequate replicates to allow robust statistics, and covers nearly 10 different data formats, including scanning quadrupole and ion mobility-enabled acquisitions. Datasets are available via ProteomeXchange (PXD028735).
Project description:To evaluate the performance of different quantitative methods without bias, and to compare popular proteomics data processing workflows, we prepared a benchmark dataset in which an E. coli proteome was spiked in at various levels, so that both the true fold changes (i.e. 1-fold, 1.5-fold, 2-fold, 2.5-fold and 3-fold) and the true identities of positives and negatives (i.e. E. coli proteins are true positives, while human proteins are true negatives) are known. To best mimic proteomics applications that compare multiple replicates, each fold-change group contains 4 replicates, giving 20 LC-MS/MS analyses in this benchmark dataset. To our knowledge, this is the largest-scale spike-in benchmark dataset to date, encompassing 5 different spike levels, >500 true positive proteins, and >3000 true negative proteins (2-peptide criterion, 1% protein FDR), with a wide concentration dynamic range. The dataset is well suited to testing quantitative accuracy, precision, false-positive biomarker discovery and missing data levels.
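The spike-in design above can be turned into a simple scoring routine, sketched here under stated assumptions. The spike levels and replicate count come from the description; the input layout (a list of dicts with `species`, `a` and `b` keys), the function names and the 0.5 log2 fold-change cutoff are hypothetical choices for illustration, not the submitters' analysis.

```python
import math
from statistics import mean

# Design values taken from the description: 5 spike levels, 4 replicates each.
SPIKE_LEVELS = [1.0, 1.5, 2.0, 2.5, 3.0]
REPLICATES_PER_LEVEL = 4


def expected_log2_fc(level_a, level_b, species):
    """Expected log2(B/A): E. coli follows the spike ratio, human stays at 0."""
    if species == "ecoli":
        return math.log2(level_b / level_a)
    return 0.0


def evaluate(proteins, level_a, level_b, threshold=0.5):
    """Score one pairwise comparison between two spike-level groups.

    proteins: list of dicts with 'species' ('ecoli' or 'human') and
    replicate intensities under 'a' and 'b' (hypothetical layout).
    threshold: assumed cutoff on |log2 fold change| for calling a change.
    """
    tp = fp = 0
    errors = []
    for p in proteins:
        observed = math.log2(mean(p["b"]) / mean(p["a"]))
        expected = expected_log2_fc(level_a, level_b, p["species"])
        errors.append(observed - expected)
        if abs(observed) >= threshold:
            # A called E. coli protein is a true positive, a called human
            # protein a false positive.
            if p["species"] == "ecoli":
                tp += 1
            else:
                fp += 1
    called = tp + fp
    return {
        "true_positives": tp,
        "false_positives": fp,
        "false_discovery_proportion": fp / called if called else 0.0,
        "mean_abs_log2_error": mean(abs(e) for e in errors),
    }


if __name__ == "__main__":
    toy = [
        {"species": "ecoli", "a": [100, 100, 100, 100], "b": [200, 200, 200, 200]},
        {"species": "human", "a": [50, 50, 50, 50], "b": [50, 50, 50, 50]},
        {"species": "human", "a": [10, 10, 10, 10], "b": [20, 20, 20, 20]},
    ]
    print(evaluate(toy, level_a=1.0, level_b=2.0))
```

A real evaluation would more likely use a statistical test per protein rather than a fixed cutoff, and would have to handle missing values; this sketch only illustrates how known species labels and spike ratios provide ground truth for accuracy and false-positive estimates.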