Proteomics

Dataset Information

ComBat HarmonizR enables the integrated analysis of independently generated proteomic datasets through data harmonization with appropriate handling of missing values


ABSTRACT: The integration of proteomic datasets generated by non-cooperating laboratories using different LC-MS/MS setups can overcome the limitations of statistically underpowered sample cohorts, but has not been demonstrated to date. In proteomics, differences in sample preservation and preparation strategies, chromatography and mass spectrometry approaches, and the quantification strategy used distort protein abundance distributions in integrated datasets. Removing these technical batch effects requires setup-specific normalization and strategies that can handle missing at random (MAR) and missing not at random (MNAR) values at the same time. Algorithms for batch effect removal that are commonly used for other omics types, such as the ComBat algorithm, disregard proteins with MNAR missing values and thereby significantly reduce the informational yield and the effect size of combined datasets. Here, we present a strategy for data harmonization across different tissue preservation techniques, LC-MS/MS instrumentation setups, and quantification approaches. To enable batch effect removal without data reduction or error-prone imputation, we developed an extension to the ComBat algorithm, ComBat HarmonizR, that performs data harmonization with appropriate handling of MAR and MNAR missing values by matrix dissection. The ComBat HarmonizR-based strategy enables the combined analysis of independently generated proteomic datasets for the first time. Furthermore, we found ComBat HarmonizR to be superior to the commonly used internal reference scaling (iRS) for removing batch effects between different Tandem Mass Tag (TMT) plexes. Because the matrix dissection approach requires no data imputation, the HarmonizR algorithm can be applied to any type of omics data while ensuring minimal data loss.
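The matrix dissection idea named in the abstract can be illustrated with a minimal sketch. This is not the HarmonizR implementation. All function names here are hypothetical, the threshold of two observed values per batch is an assumption, and a simple per-batch mean shift stands in for ComBat. Proteins are grouped by the set of batches in which they are sufficiently observed, each group is adjusted using only those batches, and the results are reassembled without imputing anything.

```python
import math
from collections import defaultdict


def batch_pattern(values, batches, min_values=2):
    """Return the sorted tuple of batches with at least min_values observed values."""
    counts = defaultdict(int)
    for v, b in zip(values, batches):
        if not math.isnan(v):
            counts[b] += 1
    return tuple(sorted(b for b, n in counts.items() if n >= min_values))


def dissect(matrix, batches, min_values=2):
    """Split the protein matrix into sub-matrices keyed by batch pattern."""
    sub = defaultdict(dict)
    for protein, values in matrix.items():
        sub[batch_pattern(values, batches, min_values)][protein] = values
    return dict(sub)


def center_per_batch(values, batches, pattern):
    """Stand-in for ComBat (assumption): shift each batch in `pattern`
    so that its mean equals the protein's mean over all batches in `pattern`."""
    observed = [v for v, b in zip(values, batches)
                if b in pattern and not math.isnan(v)]
    grand = sum(observed) / len(observed)
    out = list(values)
    for batch in pattern:
        idx = [i for i, b in enumerate(batches)
               if b == batch and not math.isnan(values[i])]
        shift = grand - sum(values[i] for i in idx) / len(idx)
        for i in idx:
            out[i] = values[i] + shift
    return out


def harmonize(matrix, batches, min_values=2):
    """Adjust each sub-matrix on its own batches, then reassemble."""
    result = {}
    for pattern, sub in dissect(matrix, batches, min_values).items():
        for protein, values in sub.items():
            if len(pattern) < 2:
                # Fewer than two usable batches: nothing to adjust in this sketch.
                result[protein] = list(values)
            else:
                result[protein] = center_per_batch(values, batches, pattern)
    return result


nan = math.nan
batches = ["A", "A", "B", "B", "C", "C"]
matrix = {
    "P1": [10.0, 12.0, 20.0, 22.0, 30.0, 32.0],  # observed in A, B, C
    "P2": [5.0, 7.0, nan, nan, 9.0, 11.0],       # missing in batch B
    "P3": [1.0, 3.0, nan, nan, nan, nan],        # observed in A only
}
harmonized = harmonize(matrix, batches)
```

In this toy input, P2 is adjusted using batches A and C only, and its missing values in batch B stay missing. The mean shift used here acts on one protein at a time, so the grouping is only illustrative. It becomes necessary for a method like ComBat, which pools information across proteins and therefore needs sub-matrices in which every protein is observed in the same batches.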

INSTRUMENT(S): TripleTOF 6600, Orbitrap Fusion, Q Exactive

ORGANISM(S): Homo sapiens (human), Escherichia coli, Saccharomyces cerevisiae (baker's yeast), Mus musculus (mouse)

TISSUE(S): Brain, Cell Culture

DISEASE(S): Medulloblastoma

SUBMITTER: Hannah Voß  

LAB HEAD: Prof. Dr. Hartmut Schlüter

PROVIDER: PXD027467 | Pride | 2022-05-23

REPOSITORIES: Pride

Similar Datasets

2014-12-11 | E-GEOD-64028 | biostudies-arrayexpress
2009-04-10 | E-GEOD-15026 | biostudies-arrayexpress
2019-04-11 | E-MTAB-6690 | biostudies-arrayexpress
2019-04-11 | E-MTAB-6691 | biostudies-arrayexpress
2019-04-11 | E-MTAB-6692 | biostudies-arrayexpress
2019-04-11 | E-MTAB-6694 | biostudies-arrayexpress
2019-04-11 | E-MTAB-6696 | biostudies-arrayexpress
2019-04-11 | E-MTAB-6697 | biostudies-arrayexpress
2019-04-11 | E-MTAB-6693 | biostudies-arrayexpress
2019-04-11 | E-MTAB-6695 | biostudies-arrayexpress