
Dataset Information

tmVar 3.0: an improved variant concept recognition and normalization tool.


ABSTRACT:

Motivation

Previous studies have shown that automated text-mining tools are becoming increasingly important for unlocking variant information in the scientific literature at scale. Despite multiple past attempts, existing tools remain limited in both recognition scope and precision.

Results

We propose tmVar 3.0, an improved variant recognition and normalization system. Compared to its predecessors, tmVar 3.0 recognizes a wider spectrum of variant-related entities (e.g. allele and copy number variants) and, for improved accuracy, groups together the different variant mentions in an article that refer to the same genomic sequence position. Moreover, tmVar 3.0 provides advanced variant normalization options such as allele-specific identifiers from the ClinGen Allele Registry. When evaluated on three independent benchmarking datasets, tmVar 3.0 exhibits state-of-the-art performance, with an F-measure above 90% for variant recognition and normalization. tmVar 3.0 and its annotations for the entire PubMed and PMC datasets are freely available for download.

Availability and implementation

https://github.com/ncbi/tmVar3

SUBMITTER: Wei CH 

PROVIDER: S-EPMC9477515 | biostudies-literature | 2022 Sep

REPOSITORIES: biostudies-literature


Publications

tmVar 3.0: an improved variant concept recognition and normalization tool.

Wei Chih-Hsuan, Allot Alexis, Riehle Kevin, Milosavljevic Aleksandar, Lu Zhiyong

Bioinformatics (Oxford, England), 2022-09-01, issue 18



Similar Datasets

| S-EPMC3951655 | biostudies-literature
| S-EPMC9563680 | biostudies-literature
| S-EPMC5730334 | biostudies-literature
| S-EPMC8982815 | biostudies-literature
| S-EPMC10710372 | biostudies-literature
| S-EPMC7415240 | biostudies-literature
| S-EPMC7805810 | biostudies-literature
| S-EPMC4448883 | biostudies-literature
| S-EPMC9738852 | biostudies-literature
| S-EPMC6535044 | biostudies-literature