Dataset Information

Peptide-binding specificity prediction using fine-tuned protein structure prediction networks.


ABSTRACT: Peptide-binding proteins play key roles in biology, and predicting their binding specificity is a long-standing challenge. While considerable protein structural information is available, the most successful current methods use sequence information alone, in part because it has been a challenge to model the subtle structural changes accompanying sequence substitutions. Protein structure prediction networks such as AlphaFold model sequence-structure relationships very accurately, and we reasoned that if it were possible to specifically train such networks on binding data, more generalizable models could be created. We show that placing a classifier on top of the AlphaFold network and fine-tuning the combined network parameters for both classification and structure prediction accuracy leads to a model with strong generalizable performance on a wide range of Class I and Class II peptide-MHC interactions that approaches the overall performance of the state-of-the-art NetMHCpan sequence-based method. The peptide-MHC optimized model shows excellent performance in distinguishing binding and non-binding peptides to SH3 and PDZ domains. This ability to generalize well beyond the training set far exceeds that of sequence-only models and should be particularly powerful for systems where less experimental data are available.
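The joint objective described in the abstract (a classifier on top of a structure prediction network, fine-tuned for both classification and structure prediction accuracy) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the record: the scalar `confidence_feature` standing in for some output of the structure network, the logistic form of the classifier head, the `slope` and `midpoint` parameters, and the `structure_weight` used to balance the two loss terms.

```python
import math


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def binding_probability(confidence_feature, slope, midpoint):
    """Hypothetical classifier head: maps a scalar feature produced by the
    structure network to a binding probability via a logistic function."""
    return sigmoid(slope * (confidence_feature - midpoint))


def binary_cross_entropy(p, label, eps=1e-9):
    """Classification loss for one peptide (label 1 = binder, 0 = non-binder)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def combined_loss(p_bind, label, structure_loss, structure_weight=1.0):
    """Assumed joint objective: classification loss plus a weighted
    structure-prediction loss, so fine-tuning optimizes both at once."""
    return binary_cross_entropy(p_bind, label) + structure_weight * structure_loss


# Example: a binder scored exactly at the classifier midpoint, with a
# placeholder structure loss of 2.0 weighted by 0.5.
p = binding_probability(confidence_feature=0.5, slope=10.0, midpoint=0.5)
loss = combined_loss(p, label=1, structure_loss=2.0, structure_weight=0.5)
```

In this sketch the structure term acts as a regularizer: the gradient of `combined_loss` would push the network to separate binders from non-binders while penalizing drift away from accurate structures. How the two terms are actually weighted and computed is not stated in this record.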

SUBMITTER: Motmaen A 

PROVIDER: S-EPMC9992841 | biostudies-literature | 2023 Feb

REPOSITORIES: biostudies-literature

Publications

Peptide-binding specificity prediction using fine-tuned protein structure prediction networks.

Motmaen A, Dauparas J, Baek M, Abedi MH, Baker D, Bradley P

Proceedings of the National Academy of Sciences of the United States of America, 2023-02-21, issue 9


Similar Datasets

| S-EPMC10705405 | biostudies-literature
| S-EPMC3162530 | biostudies-literature
| S-EPMC9416537 | biostudies-literature
| S-EPMC11427629 | biostudies-literature
| S-EPMC7924501 | biostudies-literature
| S-EPMC8155034 | biostudies-literature
| S-EPMC6029131 | biostudies-literature
| S-EPMC3812049 | biostudies-literature
| S-EPMC10469926 | biostudies-literature
| S-EPMC4908348 | biostudies-literature