Proteomics

Dataset Information

Statistical Models for the Analysis of Isobaric Tags Multiplexed Quantitative Proteomics


ABSTRACT: Mass spectrometry is being used to identify protein biomarkers that can facilitate development of drug treatment. Mass spectrometry-based proteomics produces complex, hierarchical data, often from studies with small sample sizes. The generalized linear model (GLM) is the most popular approach in proteomics for comparing protein abundances between groups. However, GLM does not address all the complexities of proteomics data, such as repeated measures and variance heterogeneity. Linear Models for Microarray Data (LIMMA) and mixed models are two approaches that can address some of these complexities and provide better statistical estimates. We compared these three statistical models to show when each approach performs best. We evaluated the methods using a dataset of known protein abundances, a systemic lupus erythematosus (SLE) dataset, and a simulated dataset. In general, the mixed model findings were a subset of the GLM findings, which in turn were a subset of the LIMMA findings. Regardless of the peptide/PSM/fold-change restrictions or FDR, fewer findings were removed from the mixed model than from LIMMA, since the mixed model is more likely to identify proteins with a larger fold change. Although the peptide/PSM restrictions led to fewer findings (but a higher percentage of findings), when combined with FDR the findings were the same as, or overlapped largely with, those obtained with FDR and no restrictions. Because the percentage of findings was higher with the restrictions, these may be the more reliable proteins. We conclude that the mixed model protected best against type I error and had the smaller MSE, while LIMMA had the better overall statistical properties.
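
As a rough illustration of why LIMMA and a per-protein linear model can disagree in small-sample studies, the sketch below compares an ordinary two-group t statistic with a limma-style moderated t statistic, in which the per-protein variance is shrunk toward a prior variance. This is not the study's analysis code: the function names and the example intensities are hypothetical, and the prior variance and prior degrees of freedom are set by hand here, whereas limma estimates them from all proteins by empirical Bayes.

```python
import math
import statistics


def ordinary_t(group_a, group_b):
    """Pooled-variance two-sample t statistic for one protein.

    Equivalent to the t-test on the group coefficient of an ordinary
    linear model with a single two-level group term.
    """
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(group_a)
                  + (nb - 1) * statistics.variance(group_b)) / df
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return diff / se, pooled_var, df


def moderated_t(group_a, group_b, prior_var, prior_df):
    """limma-style moderated t statistic for one protein.

    The protein's own variance is replaced by a weighted average of that
    variance and a prior variance. prior_var and prior_df are ASSUMED
    inputs here; limma estimates them from the whole dataset.
    """
    na, nb = len(group_a), len(group_b)
    _, pooled_var, df = ordinary_t(group_a, group_b)
    shrunk_var = (prior_df * prior_var + df * pooled_var) / (prior_df + df)
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    se = math.sqrt(shrunk_var * (1 / na + 1 / nb))
    return diff / se, shrunk_var, prior_df + df


if __name__ == "__main__":
    # Hypothetical log2 intensities for one protein, 3 samples per group.
    # The within-group variance is very small, so the ordinary t is inflated.
    control = [10.00, 10.01, 10.02]
    treated = [10.10, 10.11, 10.12]
    print("ordinary :", ordinary_t(control, treated))
    print("moderated:", moderated_t(control, treated, prior_var=0.04, prior_df=4.0))
```

With these made-up numbers the group difference is only 0.1 on the log2 scale, yet the ordinary statistic is large because the sample variance happens to be tiny. Shrinking the variance toward the prior pulls the statistic back toward zero, and the added prior degrees of freedom are what help when the sample size is small.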

INSTRUMENT(S): Orbitrap Fusion, Q Exactive

ORGANISM(S): Homo sapiens (human), Escherichia coli

TISSUE(S): Blood Serum

SUBMITTER: Raghothama Chaerkady  

LAB HEAD: Raghothama Chaerkady

PROVIDER: PXD005486 | Pride | 2017-07-28

REPOSITORIES: Pride

Publications

Statistical Models for the Analysis of Isobaric Tags Multiplexed Quantitative Proteomics.

D'Angelo G, Chaerkady R, Yu W, Hizal DB, Hess S, Zhao W, Lekstrom K, Guo X, White WI, Roskos L, Bowen MA, Yang H

Journal of Proteome Research, 2017-08-18, issue 9


Mass spectrometry is being used to identify protein biomarkers that can facilitate development of drug treatment. Mass spectrometry-based labeling proteomic experiments result in complex proteomic data that is hierarchical in nature often with small sample size studies. The generalized linear model (GLM) is the most popular approach in proteomics to compare protein abundances between groups. However, GLM does not address all the complexities of proteomics data such as repeated measures and varia ...

Similar Datasets

2024-02-23 | PXD039054 | Pride
2021-01-20 | PXD021991 | Pride
2021-08-27 | PXD021901 | Pride
2017-05-19 | PXD006468 | Pride
2021-11-02 | PXD018711 | Pride
2011-01-15 | E-GEOD-25972 | biostudies-arrayexpress
2018-02-07 | PXD003669 | Pride
2021-06-29 | PXD022267 | Pride
2023-04-20 | PXD028171 | Pride
2020-08-24 | PXD018349 | Pride