<HashMap><database>biostudies-literature</database><scores/><additional><submitter>Chang D</submitter><funding>NCATS NIH HHS</funding><funding>NLM NIH HHS</funding><pagination>167-176</pagination><full_dataset_link>https://www.ebi.ac.uk/biostudies/studies/S-EPMC7971091</full_dataset_link><repository>biostudies-literature</repository><omics_type>Unknown</omics_type><volume>2020</volume><pubmed_abstract>Much of biomedical and healthcare data is encoded in discrete, symbolic form such as text and medical codes. There is a wealth of expert-curated biomedical domain knowledge stored in knowledge bases and ontologies, but the lack of reliable methods for learning knowledge representation has limited their usefulness in machine learning applications. While text-based representation learning has significantly improved in recent years through advances in natural language processing, attempts to learn biomedical concept embeddings so far have been lacking. A recent family of models called knowledge graph embeddings have shown promising results on general domain knowledge graphs, and we explore their capabilities in the biomedical domain. We train several state-of-the-art knowledge graph embedding models on the SNOMED-CT knowledge graph, provide a benchmark with comparison to existing methods and in-depth discussion on best practices, and make a case for the importance of leveraging the multi-relational nature of knowledge graphs for learning biomedical knowledge representation. The embeddings, code, and materials will be made available to the community.</pubmed_abstract><journal>Proceedings of the conference. Association for Computational Linguistics. Meeting</journal><pubmed_title>Benchmark and Best Practices for Biomedical Knowledge Graph Embeddings.</pubmed_title><pmcid>PMC7971091</pmcid><funding_grant_id>T15 LM007056</funding_grant_id><funding_grant_id>UL1 TR001863</funding_grant_id><pubmed_authors>Balazevic I</pubmed_authors><pubmed_authors>Brandt C</pubmed_authors><pubmed_authors>Chawla D</pubmed_authors><pubmed_authors>Taylor RA</pubmed_authors><pubmed_authors>Chang D</pubmed_authors><pubmed_authors>Allen C</pubmed_authors></additional><is_claimable>false</is_claimable><name>Benchmark and Best Practices for Biomedical Knowledge Graph Embeddings.</name><description>Much of biomedical and healthcare data is encoded in discrete, symbolic form such as text and medical codes. There is a wealth of expert-curated biomedical domain knowledge stored in knowledge bases and ontologies, but the lack of reliable methods for learning knowledge representation has limited their usefulness in machine learning applications. While text-based representation learning has significantly improved in recent years through advances in natural language processing, attempts to learn biomedical concept embeddings so far have been lacking. A recent family of models called knowledge graph embeddings have shown promising results on general domain knowledge graphs, and we explore their capabilities in the biomedical domain. We train several state-of-the-art knowledge graph embedding models on the SNOMED-CT knowledge graph, provide a benchmark with comparison to existing methods and in-depth discussion on best practices, and make a case for the importance of leveraging the multi-relational nature of knowledge graphs for learning biomedical knowledge representation. The embeddings, code, and materials will be made available to the community.</description><dates><release>2020-01-01T00:00:00Z</release><publication>2020 Jul</publication><modification>2025-04-04T12:32:08.687Z</modification><creation>2025-04-04T12:32:08.687Z</creation></dates><accession>S-EPMC7971091</accession><cross_references><pubmed>33746351</pubmed><doi>10.18653/v1/2020.bionlp-1.18</doi></cross_references></HashMap>