
Dataset Information


InfersentPPI: Prediction of Protein-Protein Interaction Using Protein Sentence Embedding With Gene Ontology Information.


ABSTRACT: Protein-protein interaction (PPI) prediction is meaningful work for deciphering cellular behaviors. Although many kinds of data and machine learning algorithms have been used in PPI prediction, the performance still needs to be improved. In this paper, we propose InferSentPPI, a sentence embedding-based text mining method that uses gene ontology (GO) information for PPI prediction. First, we design a novel weighted, GO term-based protein sentence representation method that generates protein sentences containing multi-semantic information during preprocessing. Gene ontology annotation (GOA) provides the reliability of relationships between proteins and GO terms for PPI prediction. Thus, GO term-based protein sentences can help to improve the prediction performance. We then propose an InferSent_PN algorithm, based on the protein sentences and the InferSent algorithm, to extract relations between proteins. In the experiments, we evaluate the effectiveness of InferSentPPI on several benchmark datasets. The results show that our proposed method performs better than state-of-the-art methods on a large PPI dataset.

SUBMITTER: Li M 

PROVIDER: S-EPMC8995897 | biostudies-literature | 2022

REPOSITORIES: biostudies-literature


Publications

InfersentPPI: Prediction of Protein-Protein Interaction Using Protein Sentence Embedding With Gene Ontology Information.

Li Meijing, Jiang Yingying, Ryu Keun Ho

Frontiers in Genetics, 2022-03-28



Similar Datasets

| S-EPMC10917077 | biostudies-literature
| S-EPMC7739483 | biostudies-literature
| S-EPMC6350067 | biostudies-literature
| S-EPMC3086830 | biostudies-literature
| S-EPMC1449908 | biostudies-literature
| S-EPMC6884647 | biostudies-literature
| S-EPMC1941744 | biostudies-literature
| S-EPMC1892087 | biostudies-literature
| S-EPMC5293240 | biostudies-literature
| S-EPMC3176917 | biostudies-literature