Dataset Information

PathFams: statistical detection of pathogen-associated protein domains.


ABSTRACT:

Background

A substantial fraction of genes identified within bacterial genomes encode proteins of unknown function. Identifying which of these proteins represent potential virulence factors, and mapping their key virulence determinants, is a challenging but important goal.

Results

To facilitate virulence factor discovery, we performed a comprehensive analysis of 17,929 protein domain families within the Pfam database, and scored them based on their overrepresentation in pathogenic versus non-pathogenic species, taxonomic distribution, relative abundance in metagenomic datasets, and other factors.
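As an illustration of what "overrepresentation in pathogenic versus non-pathogenic species" can mean numerically, the sketch below computes a one-sided Fisher's exact test p-value for a single domain family. This is an assumed, generic enrichment statistic, not necessarily the scoring scheme used by PathFams; the function name and the genome counts are hypothetical.

```python
from math import comb


def enrichment_p_value(path_with, path_total, nonpath_with, nonpath_total):
    """One-sided Fisher's exact test for overrepresentation (illustrative only).

    path_with / path_total: pathogen genomes containing the domain / all pathogen genomes.
    nonpath_with / nonpath_total: the same counts for non-pathogen genomes.
    Returns P(X >= path_with) under the hypergeometric null, i.e. the chance of
    seeing at least this many pathogen genomes with the domain if the domain
    were distributed independently of pathogenicity.
    """
    total_genomes = path_total + nonpath_total
    total_with = path_with + nonpath_with
    upper = min(total_with, path_total)
    denom = comb(total_genomes, path_total)
    tail = sum(
        comb(total_with, k) * comb(total_genomes - total_with, path_total - k)
        for k in range(path_with, upper + 1)
    )
    return tail / denom


# Hypothetical counts: domain found in 40 of 200 pathogen genomes
# and in 5 of 300 non-pathogen genomes.
p = enrichment_p_value(40, 200, 5, 300)
print(f"one-sided p-value: {p:.3g}")
```

A small p-value here would only flag a family as a candidate; the abstract notes that taxonomic distribution, metagenomic abundance, and other factors also enter the score.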

Conclusions

We identify pathogen-associated domain families, candidate virulence factors in the human gut, and eukaryotic-like mimicry domains with likely roles in virulence. Furthermore, we provide an interactive database, PathFams, that lets users explore pathogen-associated domains and identify such domains and domain architectures in user-uploaded sequences of interest. PathFams is freely available at https://pathfams.uwaterloo.ca.

SUBMITTER: Lobb B 

PROVIDER: S-EPMC8442362 | biostudies-literature | 2021 Sep

REPOSITORIES: biostudies-literature

Publications

PathFams: statistical detection of pathogen-associated protein domains.

Lobb B, Tremblay BJ, Moreno-Hagelsieb G, Doxey AC

BMC Genomics, 2021 Sep 14, issue 1


Similar Datasets

| S-EPMC9935623 | biostudies-literature
| S-EPMC3158042 | biostudies-literature
| S-EPMC2843237 | biostudies-literature
| S-EPMC5936065 | biostudies-literature
| S-EPMC2728341 | biostudies-other
2015-12-31 | GSE69508 | GEO
| S-EPMC2838846 | biostudies-literature
| S-EPMC554113 | biostudies-literature
| S-EPMC5906580 | biostudies-literature
| S-EPMC7831269 | biostudies-literature