Dataset Information

Precision Neoantigen Discovery Using Large-scale Immunopeptidomes and Composite Modeling of MHC Peptide Presentation.


ABSTRACT: Major histocompatibility complex (MHC)-bound peptides that originate from tumor-specific genetic alterations, known as neoantigens, are an important class of anticancer therapeutic targets. Accurately predicting peptide presentation by MHC complexes is a key aspect of discovering therapeutically relevant neoantigens. Technological improvements in mass-spectrometry-based immunopeptidomics and advanced modeling techniques have vastly improved MHC presentation prediction over the past two decades. However, improvement in the sensitivity and specificity of prediction algorithms is needed for clinical applications such as the development of personalized cancer vaccines, the discovery of biomarkers for response to checkpoint blockade, and the quantification of autoimmune risk in gene therapies. Toward this end, we generated allele-specific immunopeptidomics data using 25 monoallelic cell lines and created Systematic HLA Epitope Ranking Pan Algorithm (SHERPA), a pan-allelic MHC-peptide algorithm for predicting MHC-peptide binding and presentation. In contrast to previously published large-scale monoallelic data, we used an HLA-null K562 parental cell line and a stable transfection of HLA alleles to better emulate native presentation. Our dataset includes five previously unprofiled alleles that expand MHC-binding pocket diversity in the training data and extend allelic coverage in underprofiled populations. To improve generalizability, SHERPA systematically integrates 128 monoallelic and 384 multiallelic samples with publicly available immunoproteomics data and binding assay data. Using this dataset, we developed two features that empirically estimate the propensities of genes, and of specific regions within gene bodies, to give rise to immunopeptides, thereby representing antigen processing. Using a composite model constructed with gradient boosting decision trees, multiallelic deconvolution, and 2.15 million peptides encompassing 167 alleles, we achieved a 1.44-fold improvement in positive predictive value compared with existing tools when evaluated on independent monoallelic datasets and a 1.15-fold improvement when evaluated on tumor samples. With a high degree of accuracy, SHERPA has the potential to enable precision neoantigen discovery for future clinical applications.

SUBMITTER: Pyke RM 

PROVIDER: S-EPMC8318994 | biostudies-literature | 2021

REPOSITORIES: biostudies-literature

Publications

Withdrawn: Precision Neoantigen Discovery Using Large-scale Immunopeptidomes and Composite Modeling of MHC Peptide Presentation.

Pyke, Rachel Marty; Mellacheruvu, Dattatreya; Dea, Steven; Abbott, Charles W; Zhang, Simo V; Phillips, Nick A; Harris, Jason; Bartha, Gabor; Desai, Sejal; McClory, Rena; West, John; Snyder, Michael P; Chen, Richard; Boyle, Sean Michael

Molecular & Cellular Proteomics (MCP), 2021-06-12


This article has been withdrawn by the authors. A publication of the manuscript with the correct figures and tables has been approved and the authors state the conclusions of the manuscript remain unaffected. Specifically, errors are in Figure 6A, Supplementary Figure 10B, Supplementary Figure 10C, and Supplementary Table 5. The details of the errors are as follows: the HLA types for one sample were incorrectly assigned because of a tumor/normal mislabeling from the biobank vendor. Due to the di ...

Similar Datasets

| S-EPMC10114598 | biostudies-literature
| S-EPMC8810409 | biostudies-literature
| S-EPMC11160895 | biostudies-literature
| S-EPMC11756396 | biostudies-literature
| S-SCDT-10_1038-S44321-025-00225-3 | biostudies-other
| S-EPMC7511449 | biostudies-literature
| S-EPMC10329146 | biostudies-literature
| S-EPMC10028922 | biostudies-literature
| S-EPMC10872456 | biostudies-literature
| S-EPMC9482634 | biostudies-literature