Dataset Information

Domain-PFP: Protein Function Prediction Using Function-Aware Domain Embedding Representations.


ABSTRACT: Domains are functional and structural units of proteins that govern various biological functions performed by the proteins. Therefore, the characterization of domains in a protein can serve as a proper functional representation of proteins. Here, we employ a self-supervised protocol to derive functionally consistent representations for domains by learning domain-Gene Ontology (GO) co-occurrences and associations. The domain embeddings we constructed turned out to be effective in performing actual function prediction tasks. Extensive evaluations showed that protein representations using the domain embeddings are superior to those of large-scale protein language models in GO prediction tasks. Moreover, the new function prediction method built on the domain embeddings, named Domain-PFP, significantly outperformed the state-of-the-art function predictors. Additionally, Domain-PFP demonstrated competitive performance in the CAFA3 evaluation, achieving overall the best performance among the top teams that participated in the assessment.
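To make the idea of domain-GO co-occurrence concrete, here is a minimal sketch, not the authors' method. Domain-PFP learns embeddings with a self-supervised protocol; this example swaps in plain co-occurrence counting and averaging as a stand-in. The toy annotations, the names `TRAIN`, `cooccurrence_profiles`, and `score_protein`, and the averaging step are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy annotations: protein -> (domains, GO terms). Not real data.
TRAIN = {
    "protA": ({"D1", "D2"}, {"GO:1"}),
    "protB": ({"D1"}, {"GO:1", "GO:2"}),
    "protC": ({"D3"}, {"GO:3"}),
}


def cooccurrence_profiles(train):
    """Return domain -> {GO term: fraction of proteins with that domain
    that are also annotated with the GO term}, estimated by counting."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for domains, go_terms in train.values():
        for d in domains:
            totals[d] += 1
            for g in go_terms:
                counts[d][g] += 1
    return {
        d: {g: c / totals[d] for g, c in gos.items()}
        for d, gos in counts.items()
    }


def score_protein(domains, profiles):
    """Average the profiles of the protein's known domains (an assumed
    aggregation) to obtain one score per GO term."""
    known = [profiles[d] for d in domains if d in profiles]
    scores = defaultdict(float)
    for profile in known:
        for g, v in profile.items():
            scores[g] += v / len(known)
    return dict(scores)


PROFILES = cooccurrence_profiles(TRAIN)
QUERY_SCORES = score_protein({"D1", "D3"}, PROFILES)
```

A count-based profile like this captures only direct co-occurrence; a learned embedding, as the abstract describes, is meant to also capture associations between domains and GO terms that never appear together in the training annotations.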

SUBMITTER: Ibtehaz N 

PROVIDER: S-EPMC10473699 | biostudies-literature | 2023 Aug

REPOSITORIES: biostudies-literature

Publications

Domain-PFP: Protein Function Prediction Using Function-Aware Domain Embedding Representations.

Nabil Ibtehaz, Yuki Kagaya, Daisuke Kihara

bioRxiv: the preprint server for biology, 2023-08-24


Similar Datasets

| S-EPMC10618451 | biostudies-literature
| S-EPMC9556876 | biostudies-literature
| S-EPMC6394400 | biostudies-literature
| S-EPMC3584934 | biostudies-literature
| S-EPMC10187222 | biostudies-literature
| S-EPMC4287954 | biostudies-literature
| S-EPMC2242549 | biostudies-literature
| S-EPMC11794456 | biostudies-literature
| S-EPMC5823857 | biostudies-literature
| S-EPMC10266766 | biostudies-literature