ABSTRACT: Summary
Relation extraction (RE) from large text collections is an important tool for database curation, pathway reconstruction, or functional omics data analysis. In practice, RE often is part of a complex data analysis pipeline requiring specific adaptations like restricting the types of relations or the set of proteins to be considered. However, current systems are either non-programmable web sites or research code with fixed functionality. We present PEDL+, a user-friendly tool for extracting protein-protein and protein-chemical associations from PubMed articles. PEDL+ combines state-of-the-art NLP technology with adaptable ranking and filtering options and can easily be integrated into analysis pipelines. We evaluated PEDL+ in two pathway curation projects and found that 59% to 80% of its extractions were helpful.
Availability and implementation
PEDL+ is freely available at https://github.com/leonweber/pedl.
SUBMITTER: Weber L
PROVIDER: S-EPMC10660277 | biostudies-literature | 2023 Nov
REPOSITORIES: biostudies-literature
AUTHORS: Leon Weber, Fabio Barth, Leonie Lorenz, Fabian Konrath, Kirsten Huska, Jana Wolf, Ulf Leser
JOURNAL: Bioinformatics (Oxford, England), 2023-11-01, issue 11