
Dataset Information


Seq-InSite: sequence supersedes structure for protein interaction site prediction.


ABSTRACT:

Motivation

Proteins accomplish cellular functions by interacting with each other, which makes the prediction of interaction sites a fundamental problem. Because experimental methods are expensive and time-consuming, computational prediction of interaction sites has been studied extensively. Structure-based programs are the most accurate, while sequence-based ones are much more widely applicable, as available sequences outnumber structures by two orders of magnitude. Ideally, we would like a tool that has the quality of the former and the applicability of the latter.

Results

We provide here the first solution that achieves these two goals. Our new sequence-based program, Seq-InSite, greatly surpasses the performance of existing sequence-based models and matches the quality of state-of-the-art structure-based predictors, thus effectively superseding the need for models that require structure. The predictive power of Seq-InSite is illustrated using an analysis of evolutionary conservation for four protein sequences.

Availability and implementation

Seq-InSite is freely available as a web server at http://seq-insite.csd.uwo.ca/ and as free source code, including trained models and all datasets used for training and testing, at https://github.com/lucian-ilie/Seq-InSite.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Hosseini S 

PROVIDER: S-EPMC10796176 | biostudies-literature | 2024 Jan

REPOSITORIES: biostudies-literature


Publications

Seq-InSite: sequence supersedes structure for protein interaction site prediction.

SeyedMohsen Hosseini, G. Brian Golding, Lucian Ilie

Bioinformatics (Oxford, England), 2024-01-01, 1



Similar Datasets

| S-EPMC5793808 | biostudies-literature
| S-EPMC4338852 | biostudies-literature
| S-EPMC5002282 | biostudies-literature
| S-EPMC4319528 | biostudies-literature
| S-EPMC5963530 | biostudies-literature
| S-EPMC8665744 | biostudies-literature
| S-EPMC8897833 | biostudies-literature
| S-EPMC4918422 | biostudies-literature
| S-EPMC3867645 | biostudies-literature
| S-EPMC11783280 | biostudies-literature