
Dataset Information


New alignment method for remote protein sequences by the direct use of pairwise sequence correlations and substitutions.


ABSTRACT: Understanding protein sequences and how they relate to protein function is extremely important. Sequence alignment is one of the most basic operations in bioinformatics, and usually the first thing learned from an alignment is which positions are most conserved; these are often critical parts of the structure, such as enzyme active site residues. In addition, the contact pairs in a protein usually correspond closely to the correlations between residue positions in the multiple sequence alignment, and these pairs usually change in a systematic and coordinated way: if one position changes, then the other member of the pair also changes to compensate. In the present work, these correlated pairs are taken as anchor points for a new type of sequence alignment. The main advantage of the method is that it combines the remote homolog detection of our method PROST with the pairwise sequence substitutions of the rigorous method of Kleinjung et al. We show a few examples of the resulting sequence alignments and how they can lead to improved alignments for function, even for a disordered protein.
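
The idea of correlated residue positions in a multiple sequence alignment can be illustrated with a small sketch. The Python below computes the mutual information between two alignment columns, one common and simple measure of column correlation. It is not the scoring used by PROST or by Kleinjung et al.; the function name `column_mutual_information` and the toy alignment are assumptions made purely for illustration.

```python
from collections import Counter
from math import log2


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `msa` is a list of equal-length aligned sequences. This is a generic
    covariation measure, used here only as an illustrative stand-in.
    """
    n = len(msa)
    col_i = [seq[i] for seq in msa]
    col_j = [seq[j] for seq in msa]
    count_i = Counter(col_i)               # residue counts in column i
    count_j = Counter(col_j)               # residue counts in column j
    count_ij = Counter(zip(col_i, col_j))  # joint residue-pair counts

    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written in terms of raw counts
        mi += p_ab * log2(p_ab * n * n / (count_i[a] * count_j[b]))
    return mi


# Toy alignment, invented for illustration: columns 1 and 3 swap K and E
# together, while columns 0 and 2 are fully conserved.
toy_msa = [
    "AKLE",
    "AKLE",
    "AELK",
    "AELK",
]

covarying = column_mutual_information(toy_msa, 1, 3)
conserved = column_mutual_information(toy_msa, 0, 1)
```

In this toy case the covarying pair of columns carries one bit of mutual information, while a pair involving a fully conserved column carries none. That is the kind of contrast that would let correlated pairs stand out as candidate anchor points.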

SUBMITTER: Jia K 

PROVIDER: S-EPMC10602800 | biostudies-literature | 2023

REPOSITORIES: biostudies-literature


Publications

New alignment method for remote protein sequences by the direct use of pairwise sequence correlations and substitutions.

Kejue Jia, Mesih Kilinc, Robert L. Jernigan

Frontiers in Bioinformatics, 2023-10-12



Similar Datasets

| S-EPMC6510764 | biostudies-literature
| S-EPMC6137996 | biostudies-literature
| S-EPMC11255384 | biostudies-literature
| S-EPMC6528274 | biostudies-literature
| S-EPMC1579236 | biostudies-literature
| S-EPMC2850363 | biostudies-literature
| S-EPMC8901008 | biostudies-literature
| S-EPMC2668612 | biostudies-literature
| S-EPMC41136 | biostudies-other
| S-EPMC3431198 | biostudies-literature