ABSTRACT: Summary
Protein structures carry a signal of common ancestry and can therefore aid in reconstructing the evolutionary histories of proteins. To expedite structure-informed inference, a web server, Structome, has been developed that allows users to rapidly identify protein structures similar to a query protein and to assemble datasets useful for structure-based phylogenetics. Structome was created by clustering ∼94% of the structures in RCSB PDB at 90% sequence identity and representing each cluster by a centroid structure. Structural similarity between centroid proteins was calculated, and annotations from PDB, SCOP, and CATH were integrated. To illustrate its utility, an H3 histone was used as a query; the protein structures returned by Structome span both the sequence and the structural diversity of the histone fold. Additionally, the pre-computed nexus-formatted distance matrix provided by Structome enables analysis of evolutionary relationships between proteins that are not identifiable using searches based on sequence similarity alone. Our results demonstrate that, beginning with a single structure, Structome can be used to rapidly generate a dataset of structural neighbours and allows the deep evolutionary history of proteins to be studied.
Availability and implementation
Structome is available at: https://structome.bii.a-star.edu.sg.
SUBMITTER: Malik AJ
PROVIDER: S-EPMC10692761 | biostudies-literature | 2023
REPOSITORIES: biostudies-literature
AUTHORS: Malik AJ, Langer D, Verma CS, Poole AM, Allison JR
Bioinformatics Advances, 2023-10-03, 1