{"database":"biostudies-literature","file_versions":[],"scores":null,"additional":{"submitter":["Chang D"],"funding":["NCATS NIH HHS","NLM NIH HHS"],"pagination":["167-176"],"full_dataset_link":["https://www.ebi.ac.uk/biostudies/studies/S-EPMC7971091"],"repository":["biostudies-literature"],"omics_type":["Unknown"],"volume":["2020"],"pubmed_abstract":["Much of biomedical and healthcare data is encoded in discrete, symbolic form such as text and medical codes. There is a wealth of expert-curated biomedical domain knowledge stored in knowledge bases and ontologies, but the lack of reliable methods for learning knowledge representation has limited their usefulness in machine learning applications. While text-based representation learning has significantly improved in recent years through advances in natural language processing, attempts to learn biomedical concept embeddings so far have been lacking. A recent family of models called knowledge graph embeddings have shown promising results on general domain knowledge graphs, and we explore their capabilities in the biomedical domain. We train several state-of-the-art knowledge graph embedding models on the SNOMED-CT knowledge graph, provide a benchmark with comparison to existing methods and in-depth discussion on best practices, and make a case for the importance of leveraging the multi-relational nature of knowledge graphs for learning biomedical knowledge representation. The embeddings, code, and materials will be made available to the community."],"journal":["Proceedings of the conference. Association for Computational Linguistics. Meeting"],"pubmed_title":["Benchmark and Best Practices for Biomedical Knowledge Graph Embeddings."],"pmcid":["PMC7971091"],"funding_grant_id":["T15 LM007056","UL1 TR001863"],"pubmed_authors":["Balazevic I","Brandt C","Chawla D","Taylor RA","Chang D","Allen C"],"additional_accession":[]},"is_claimable":false,"name":"Benchmark and Best Practices for Biomedical Knowledge Graph Embeddings.","description":"Much of biomedical and healthcare data is encoded in discrete, symbolic form such as text and medical codes. There is a wealth of expert-curated biomedical domain knowledge stored in knowledge bases and ontologies, but the lack of reliable methods for learning knowledge representation has limited their usefulness in machine learning applications. While text-based representation learning has significantly improved in recent years through advances in natural language processing, attempts to learn biomedical concept embeddings so far have been lacking. A recent family of models called knowledge graph embeddings have shown promising results on general domain knowledge graphs, and we explore their capabilities in the biomedical domain. We train several state-of-the-art knowledge graph embedding models on the SNOMED-CT knowledge graph, provide a benchmark with comparison to existing methods and in-depth discussion on best practices, and make a case for the importance of leveraging the multi-relational nature of knowledge graphs for learning biomedical knowledge representation. The embeddings, code, and materials will be made available to the community.","dates":{"release":"2020-01-01T00:00:00Z","publication":"2020 Jul","modification":"2025-04-04T12:32:08.687Z","creation":"2025-04-04T12:32:08.687Z"},"accession":"S-EPMC7971091","cross_references":{"pubmed":["33746351"],"doi":["10.18653/v1/2020.bionlp-1.18"]}}