Dataset Information

Bayesian Hierarchical Model for Protein Identifications.

ABSTRACT: In proteomics, identification of proteins from complex mixtures of proteins extracted from biological samples is an important problem. Among the experimental technologies, Mass-Spectrometry (MS) is the most popular one. Protein identification from MS data typically relies on a "two-step" procedure that identifies peptides first and then performs a separate protein identification step. In this setup, the interdependence of peptides and proteins is neglected, resulting in relatively inaccurate protein identification. In this article, we propose a Markov chain Monte Carlo (MCMC) based Bayesian hierarchical model, a first of its kind in protein identification, which integrates the two steps and performs joint analysis of proteins and peptides using posterior probabilities. We remove the assumption of independence of proteins by assigning clustering group priors to the proteins, based on the assumption that proteins sharing the same biological pathway are correlated and are likely to be present or absent together. Because the complete conditionals of the proposed joint model are tractable, we propose and implement a Gibbs sampling scheme for full posterior inference that provides estimates and statistical uncertainties for all relevant parameters. The model has better operational characteristics compared to two existing "one-step" procedures on a range of simulation settings as well as on two well-studied datasets.
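The Gibbs sampling idea mentioned in the abstract can be illustrated with a deliberately simplified toy model. This is not the authors' model: it assumes independent Bernoulli priors on protein presence (no clustering group priors), binary peptide detections, and fixed detection rates. The function name `gibbs_protein_presence` and the parameters `prior`, `sens`, and `fpr` are hypothetical names chosen for this sketch. It only shows how a tractable complete conditional for each protein lets one sample protein states jointly with the shared peptide evidence.

```python
import random

def gibbs_protein_presence(y, parents, n_proteins, prior=0.5,
                           sens=0.9, fpr=0.05,
                           n_iter=2000, burn_in=500, seed=0):
    """Toy Gibbs sampler (illustrative assumption, not the published model).

    y[i]       : 1 if peptide i was detected, else 0
    parents[i] : indices of the proteins that could produce peptide i
    Returns the estimated posterior probability that each protein is present.
    """
    rng = random.Random(seed)

    # children[j] lists the peptides that map to protein j
    children = [[] for _ in range(n_proteins)]
    for i, protein_ids in enumerate(parents):
        for j in protein_ids:
            children[j].append(i)

    def peptide_lik(i, z):
        # A peptide is detected with probability `sens` if any parent
        # protein is present, and with probability `fpr` otherwise.
        p_detect = sens if any(z[j] for j in parents[i]) else fpr
        return p_detect if y[i] else 1.0 - p_detect

    z = [rng.random() < prior for _ in range(n_proteins)]
    counts = [0] * n_proteins

    for it in range(n_iter):
        for j in range(n_proteins):
            # Complete conditional of z[j] given all other proteins and y:
            # prior weight times the likelihood of the peptides it touches.
            weights = []
            for state in (False, True):
                z[j] = state
                w = prior if state else 1.0 - prior
                for i in children[j]:
                    w *= peptide_lik(i, z)
                weights.append(w)
            z[j] = rng.random() < weights[1] / (weights[0] + weights[1])
        if it >= burn_in:
            for j in range(n_proteins):
                counts[j] += z[j]

    return [c / (n_iter - burn_in) for c in counts]


# Hypothetical example: peptide 1 is shared by proteins 0 and 1.
detections = [1, 1, 0, 1, 0]
peptide_parents = [[0], [0, 1], [1], [0], [2]]
posterior = gibbs_protein_presence(detections, peptide_parents, n_proteins=3)
print(posterior)
```

In this made-up example, the shared peptide is already explained by protein 0, which also has two detected unique peptides, so the sampler assigns protein 1 a low posterior probability. Treating the two identification steps separately would not capture this kind of interdependence.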

SUBMITTER: Mitra R

PROVIDER: S-EPMC6519717 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature

Publications

Bayesian Hierarchical Model for Protein Identifications.

Mitra R, Gill R, Sikdar S, Datta S

Journal of Applied Statistics, 2018-03-25, 1

PMID: 31105371

Similar Datasets

Project description:

BACKGROUND: The basket trial evaluates the treatment effect of a targeted therapy in patients with the same genetic or molecular aberration, regardless of their cancer types. Bayesian hierarchical modeling has been proposed to adaptively borrow information across cancer types to improve the statistical power of basket trials. Although conceptually attractive, research has shown that Bayesian hierarchical models cannot appropriately determine the degree of information borrowing and may lead to substantially inflated type I error rates.

METHODS: We propose a novel calibrated Bayesian hierarchical model approach to evaluate the treatment effect in basket trials. In our approach, the shrinkage parameter that controls information borrowing is not regarded as an unknown parameter. Instead, it is defined as a function of a similarity measure of the treatment effect across tumor subgroups. The key is that the function is calibrated using simulation such that information is strongly borrowed across subgroups if their treatment effects are similar and barely borrowed if the treatment effects are heterogeneous.

RESULTS: The simulation study shows that our method has substantially better controlled type I error rates than the Bayesian hierarchical model. In some scenarios, for example, when the true response rate is between the null and alternative, the type I error rate of the proposed method can be inflated from 10% up to 20%, but is still better than that of the Bayesian hierarchical model.

LIMITATION: The proposed design assumes a binary endpoint. Extension of the proposed design to ordinal and time-to-event endpoints is worthy of further investigation.

CONCLUSION: The calibrated Bayesian hierarchical model provides a practical approach to design basket trials with more flexibility and better controlled type I error rates than the Bayesian hierarchical model. The software for implementing the proposed design is available at http://odin.mdacc.tmc.edu/~yyuan/index_code.html.


