Dataset Information

Metapaths: similarity search in heterogeneous knowledge graphs via meta-paths.


ABSTRACT:

Summary

Heterogeneous knowledge graphs (KGs) have enabled the modeling of complex systems, from genetic interaction graphs and protein-protein interaction networks to networks representing drugs, diseases, proteins, and side effects. Analytical methods for KGs rely on quantifying similarities between entities, such as nodes, in the graph. However, such methods must consider the diversity of node and edge types contained within the KG via, for example, defined sequences of entity types known as meta-paths. We present metapaths, the first R software package to implement meta-paths and perform meta-path-based similarity search in heterogeneous KGs. The metapaths package offers various built-in similarity metrics for node pair comparison by querying KGs represented as either edge or adjacency lists, as well as auxiliary aggregation methods to measure set-level relationships. Indeed, evaluation of these methods on an open-source biomedical KG recovered meaningful drug and disease-associated relationships, including those in Alzheimer's disease. The metapaths framework facilitates the scalable and flexible modeling of network similarities in KGs with applications across KG learning.

Availability and implementation

The metapaths R package is available via GitHub at https://github.com/ayushnoori/metapaths and is released under MPL 2.0 (Zenodo DOI: 10.5281/zenodo.7047209). Package documentation and usage examples are available at https://www.ayushnoori.com/metapaths.
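To make the idea of meta-path-based similarity concrete, the following is a minimal Python sketch and not the metapaths package's R API. It assumes a hypothetical toy edge list, considers node types only (edge types are ignored for brevity), and uses PathSim, a similarity metric from the meta-path literature that is defined for symmetric meta-paths; whether the package's built-in metrics match this exact definition is not stated in the summary above. All function and variable names are illustrative.

```python
from collections import defaultdict

# Hypothetical toy KG as a typed edge list:
# (source, source_type, target, target_type)
EDGES = [
    ("drugA", "drug", "prot1", "protein"),
    ("drugA", "drug", "prot2", "protein"),
    ("drugB", "drug", "prot2", "protein"),
    ("drugC", "drug", "prot3", "protein"),
]


def build_adjacency(edges):
    """Convert the edge list to an undirected adjacency list and a node-type map."""
    adj = defaultdict(set)
    node_type = {}
    for src, src_type, dst, dst_type in edges:
        adj[src].add(dst)
        adj[dst].add(src)
        node_type[src] = src_type
        node_type[dst] = dst_type
    return adj, node_type


def count_paths(adj, node_type, start, end, metapath):
    """Count walks from start to end whose node types follow the meta-path."""
    if node_type.get(start) != metapath[0]:
        return 0
    frontier = {start: 1}  # node -> number of matching walks reaching it
    for wanted_type in metapath[1:]:
        next_frontier = defaultdict(int)
        for node, n_walks in frontier.items():
            for neighbor in adj[node]:
                if node_type[neighbor] == wanted_type:
                    next_frontier[neighbor] += n_walks
        frontier = next_frontier
    return frontier.get(end, 0)


def pathsim(adj, node_type, x, y, metapath):
    """PathSim: 2 * |x~>y| / (|x~>x| + |y~>y|) along a symmetric meta-path."""
    xy = count_paths(adj, node_type, x, y, metapath)
    xx = count_paths(adj, node_type, x, x, metapath)
    yy = count_paths(adj, node_type, y, y, metapath)
    denom = xx + yy
    return 2 * xy / denom if denom else 0.0


adj, node_type = build_adjacency(EDGES)
DRUG_PROTEIN_DRUG = ("drug", "protein", "drug")
score_ab = pathsim(adj, node_type, "drugA", "drugB", DRUG_PROTEIN_DRUG)
score_ac = pathsim(adj, node_type, "drugA", "drugC", DRUG_PROTEIN_DRUG)
```

In this toy graph, drugA and drugB share one protein target and so receive a nonzero score along the drug-protein-drug meta-path, while drugA and drugC share none and score zero. Set-level relationships of the kind the summary mentions could then be sketched by aggregating such pairwise scores, for example by averaging over all pairs drawn from two node sets.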

SUBMITTER: Noori A 

PROVIDER: S-EPMC10209523 | biostudies-literature | 2023 May

REPOSITORIES: biostudies-literature

Publications

Metapaths: similarity search in heterogeneous knowledge graphs via meta-paths.

Noori A, Li MM, Tan ALM, Zitnik M

Bioinformatics (Oxford, England), 2023 May 1, issue 5


Similar Datasets

| S-EPMC2796334 | biostudies-literature
2023-08-18 | PXD042954 | Pride
| S-EPMC5146994 | biostudies-literature
| S-EPMC2708159 | biostudies-literature
| S-EPMC10417344 | biostudies-literature
| S-EPMC2850364 | biostudies-literature
| S-EPMC8296689 | biostudies-literature
| S-EPMC7910781 | biostudies-literature
| S-EPMC7304720 | biostudies-literature
2022-02-20 | PXD018905 | Pride