ABSTRACT: Background
Hetnets, short for "heterogeneous networks," contain multiple node and relationship types and offer a way to encode biomedical knowledge. One such example, Hetionet, connects 11 types of nodes (including genes, diseases, drugs, pathways, and anatomical structures) with over 2 million edges of 24 types. Previous work has demonstrated that supervised machine learning methods applied to such networks can identify drug repurposing opportunities. However, a training set of known relationships does not exist for many types of node pairs, even when it would be useful to examine how nodes of those types are meaningfully connected. For example, users may be curious about not only how metformin is related to breast cancer but also how a given gene might be involved in insomnia.
Findings
We developed a new procedure, termed hetnet connectivity search, that proposes important paths between any 2 nodes without requiring a supervised gold standard. The algorithm behind connectivity search identifies types of paths that occur more frequently than would be expected by chance (based on node degree alone). Several optimizations were required to precompute significant instances of node connectivity at the scale of large knowledge graphs.
Conclusion
We implemented the method on Hetionet and provide an online interface at https://het.io/search. We provide an open-source implementation of these methods in our new Python package named hetmatpy.
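The idea described under Findings, flagging path types that occur more often than node degree alone would predict, can be illustrated with a small sketch. This is not the hetmatpy API and not the authors' exact statistic: it uses raw path counts and an empirical p-value from degree-preserving edge swaps. The toy matrices, the Compound-Gene-Disease path type, and the function names `xswap`, `path_counts`, and `enrichment` are all assumptions made for illustration.

```python
import numpy as np


def xswap(adj, n_swaps, rng):
    """Shuffle a 0/1 bipartite adjacency matrix while preserving every node's degree."""
    adj = adj.copy()
    edges = [tuple(e) for e in np.argwhere(adj == 1)]
    for _ in range(n_swaps):
        i, j = rng.choice(len(edges), size=2, replace=False)
        (a, b), (c, d) = edges[i], edges[j]
        # Proposed swap: (a,b),(c,d) -> (a,d),(c,b). Skip if it would duplicate an edge.
        if adj[a, d] or adj[c, b]:
            continue
        adj[a, b] = adj[c, d] = 0
        adj[a, d] = adj[c, b] = 1
        edges[i], edges[j] = (a, d), (c, b)
    return adj


def path_counts(a_cg, a_gd):
    """Count Compound-Gene-Disease paths for every (compound, disease) pair."""
    return a_cg @ a_gd


def enrichment(a_cg, a_gd, n_perms=200, seed=0):
    """Compare observed path counts with counts from degree-preserving permutations."""
    rng = np.random.default_rng(seed)
    observed = path_counts(a_cg, a_gd)
    null_sum = np.zeros(observed.shape, dtype=float)
    exceed = np.zeros(observed.shape, dtype=float)
    for _ in range(n_perms):
        p_cg = xswap(a_cg, 10 * int(a_cg.sum()), rng)
        p_gd = xswap(a_gd, 10 * int(a_gd.sum()), rng)
        null = path_counts(p_cg, p_gd)
        null_sum += null
        exceed += null >= observed
    # Add-one smoothing keeps the empirical p-value strictly above zero.
    p_value = (exceed + 1) / (n_perms + 1)
    return observed, null_sum / n_perms, p_value


# Hypothetical toy hetnet: 3 compounds, 4 genes, 2 diseases.
a_cg = np.array([[1, 1, 0, 0],
                 [0, 1, 1, 0],
                 [0, 0, 1, 1]])
a_gd = np.array([[1, 0],
                 [1, 0],
                 [0, 1],
                 [0, 1]])

observed, null_mean, p_value = enrichment(a_cg, a_gd)
print(observed)
print(null_mean.round(2))
print(p_value.round(3))
```

Pairs whose observed count sits well above the permutation mean, with a small empirical p-value, are the ones such a procedure would surface. A real implementation at knowledge-graph scale would need sparse matrices and precomputed null distributions, in line with the optimizations the abstract mentions.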
SUBMITTER: Himmelstein DS
PROVIDER: S-EPMC10375517 | biostudies-literature | 2022 Dec
REPOSITORIES: biostudies-literature
Himmelstein DS, Zietz M, Rubinetti V, Kloster K, Heil BJ, Alquaddoomi F, Hu D, Nicholson DN, Hao Y, Sullivan BD, Nagle MW, Greene CS
GigaScience, 2022-12-01