Dataset Information

Beyond blast: enabling microbiologists to better extract literature, taxonomic distributions and gene neighbourhood information for protein families.


ABSTRACT: Capturing the published corpus of information on all members of a given protein family should be an essential step in any study focusing on specific members of that family. Using a previously gathered dataset of more than 280 references mentioning a member of the DUF34 (NIF3/Ngg1-interacting Factor 3) family, we evaluated the efficiency of different databases and search tools, and devised a workflow that experimentalists can use to capture the most information published on members of a protein family in the least amount of time. To complement this workflow, web-based platforms allowing for the exploration of protein family members across sequenced genomes or for the analysis of gene neighbourhood information were reviewed for their versatility and ease of use. Recommendations that can be used for experimentalist users, as well as educators, are provided and integrated within a customized, publicly accessible Wiki.

SUBMITTER: Reed CJ 

PROVIDER: S-EPMC10926702 | biostudies-literature | 2024 Feb

REPOSITORIES: biostudies-literature

Publications

Beyond blast: enabling microbiologists to better extract literature, taxonomic distributions and gene neighbourhood information for protein families.

Reed CJ, Denise R, Hourihan J, Babor J, Jaroch M, Martinelli M, Hutinet G, de Crécy-Lagard V

Microbial Genomics, 2024-02-01, issue 2



Similar Datasets

| S-EPMC7473582 | biostudies-literature
| S-EPMC1538791 | biostudies-literature
| S-EPMC7515426 | biostudies-literature
| S-EPMC4186660 | biostudies-literature
| S-EPMC5969410 | biostudies-literature
| S-EPMC9041710 | biostudies-literature
| S-EPMC10281857 | biostudies-literature
| S-EPMC10117768 | biostudies-literature
| S-EPMC11193052 | biostudies-literature
| S-EPMC9060555 | biostudies-literature