
Dataset Information


zol & fai: large-scale targeted detection and evolutionary investigation of gene clusters.


ABSTRACT: Many universally and conditionally important genes are genomically aggregated within clusters. Here, we introduce fai and zol, which together enable large-scale comparative analysis of different types of gene clusters and mobile-genetic elements (MGEs), such as biosynthetic gene clusters (BGCs) or viruses. Fundamentally, they overcome a current bottleneck to reliably perform comprehensive orthology inference at large scale across broad taxonomic contexts and thousands of genomes. First, fai allows the identification of orthologous or homologous instances of a query gene cluster of interest amongst a database of target genomes. Subsequently, zol enables reliable, context-specific inference of protein-encoding ortholog groups for individual genes across gene cluster instances. In addition, zol performs functional annotation and computes a variety of statistics for each inferred ortholog group. These programs are showcased through application to: (i) longitudinal tracking of a virus in metagenomes, (ii) discovering novel population-genetic insights of two common BGCs in a fungal species, and (iii) uncovering large-scale evolutionary trends of a virulence-associated gene cluster across thousands of genomes from a diverse bacterial genus.

SUBMITTER: Salamzade R 

PROVIDER: S-EPMC10274777 | biostudies-literature | 2023 Jul

REPOSITORIES: biostudies-literature


Publications

zol & fai: large-scale targeted detection and evolutionary investigation of gene clusters.

Rauf Salamzade, Patricia Q. Tran, Cody Martin, Abigail L. Manson, Michael S. Gilmore, Ashlee M. Earl, Karthik Anantharaman, Lindsay R. Kalan

bioRxiv: the preprint server for biology, 2024-09-12



Similar Datasets

| S-EPMC11795205 | biostudies-literature
| S-EPMC5709505 | biostudies-literature
| S-EPMC3040130 | biostudies-literature
| S-EPMC11293555 | biostudies-literature
| S-EPMC11571958 | biostudies-literature
2020-11-18 | GSE156074 | GEO
| S-EPMC2682585 | biostudies-literature
| S-EPMC3632110 | biostudies-literature
| S-EPMC8043073 | biostudies-literature
| S-EPMC4569707 | biostudies-literature