Dataset Information

Computational protein design: validation and possible relevance as a tool for homology searching and fold recognition.

ABSTRACT:

Background

Protein fold recognition usually relies on a statistical model of each fold; each model is constructed from an ensemble of natural sequences belonging to that fold. A complementary strategy may be to employ sequence ensembles produced by computational protein design. Designed sequences can be more diverse than natural sequences, possibly avoiding some limitations of experimental databases.

Methodology/principal findings

We explore this strategy for four SCOP families: Small Kunitz-type inhibitors (SKIs), Interleukin-8 chemokines, PDZ domains, and large Caspase catalytic subunits, represented by 43 structures. An automated procedure is used to redesign the 43 proteins. We use the experimental backbones as fixed templates in the folded state and a molecular mechanics model to compute the interaction energies between sidechain and backbone groups. Calculations are done with the Proteins@Home volunteer computing platform. A heuristic algorithm is used to scan the sequence and conformational space, yielding 200,000-300,000 sequences per backbone template. The results confirm and generalize our earlier study of SH2 and SH3 domains. The designed sequences resemble moderately distant natural homologues of the initial templates; e.g., the SUPERFAMILY library of profile hidden Markov models recognizes 85% of the low-energy sequences as native-like. Conversely, Position Specific Scoring Matrices derived from the designed sequences can be used to detect natural homologues within the SwissProt database: 60% of known PDZ domains are detected, and around 90% of known SKIs and chemokines. Energy components and inter-residue correlations are analyzed, and ways to improve the method are discussed.
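As a rough illustration of how a Position Specific Scoring Matrix can be derived from an ensemble of designed sequences, the sketch below builds a log-odds matrix from equal-length sequences and scores a query against it. This is not the authors' pipeline: the record does not describe how their matrices were built, and the uniform background frequencies, the pseudocount value, the function names `build_pssm` and `score`, and the four toy sequences are all assumptions made for this example.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(sequences, pseudocount=1.0):
    """Build a log-odds PSSM (one dict per position) from equal-length sequences.

    Assumptions for this sketch: a uniform background frequency (1/20) and a
    flat pseudocount added to every residue count at every position.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")

    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        # Start every residue at the pseudocount so unseen residues get a finite score.
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in sequences:
            counts[seq[pos]] += 1.0
        total = sum(counts.values())
        # Log-odds of the observed frequency relative to the background.
        column = {aa: math.log2((counts[aa] / total) / background) for aa in AMINO_ACIDS}
        pssm.append(column)
    return pssm


def score(pssm, query):
    """Sum the per-position log-odds scores of an ungapped query sequence."""
    if len(query) != len(pssm):
        raise ValueError("query length must match the PSSM length")
    return sum(column[aa] for column, aa in zip(pssm, query))


# Toy "designed" sequences, invented for illustration only.
designed = ["ACDK", "ACEK", "ACDR", "GCDK"]
pssm = build_pssm(designed)

# A sequence close to the ensemble should outscore an unrelated one.
print(score(pssm, "ACDK") > score(pssm, "WWWW"))
```

A real homologue search would add gapped alignment, sequence weighting, and an empirical background distribution, which this sketch leaves out.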

Conclusions/significance

For some families, designed sequences can be a useful complement to experimental ones for homologue searching. However, improved tools are needed to extract more information from the designed profiles before the method can be of general use.

SUBMITTER: Schmidt Am Busch M

PROVIDER: S-EPMC2864755 | biostudies-literature | 2010 May

REPOSITORIES: biostudies-literature

Publications

Computational protein design: validation and possible relevance as a tool for homology searching and fold recognition.

Marcel Schmidt Am Busch, Audrey Sedano, Thomas Simonson

PLoS ONE. 2010 May 5; 5(5)

PMID: 20463972

Similar Datasets

TMFoldRec: a statistical potential-based transmembrane protein fold recognition tool.
| S-EPMC4486421 | biostudies-literature

Computational Design of Radical Recognition Assay with the Possible Application of Cyclopropyl Vinyl Sulfides as Tunable Sensors.
| S-EPMC8306039 | biostudies-literature

Triplet state homoaromaticity: concept, computational validation and experimental relevance.
| S-EPMC5916107 | biostudies-literature

FRalanyzer: a tool for functional analysis of fold-recognition sequence-structure alignments.
| S-EPMC1933221 | biostudies-literature

Sequence context-specific profiles for homology searching.
| S-EPMC2645910 | biostudies-literature

LOMETS2: improved meta-threading server for fold-recognition and structure-based function annotation for distant-homology proteins.
| S-EPMC6602514 | biostudies-literature

DeepMSA: constructing deep multiple sequence alignment to improve contact prediction and fold-recognition for distant-homology proteins.
| S-EPMC7141871 | biostudies-literature

Cooperative RecA clustering: the key to efficient homology searching.
| S-EPMC5714135 | biostudies-literature

Computational investigations on target-site searching and recognition mechanisms by thymine DNA glycosylase during DNA repair process.
| S-EPMC9828053 | biostudies-literature

Expanding the space of protein geometries by computational design of de novo fold families.
| S-EPMC7787817 | biostudies-literature