Nassar2022 - Biosynthetic Gene Cluster (BGC) compound NER model
ABSTRACT: The bgc-compound model is a Named Entity Recognition (NER) model that identifies and annotates the compounds of Biosynthetic Gene Clusters (BGCs) in texts.
Project description: The bgc-accession model is a Named Entity Recognition (NER) model that identifies and annotates the accessions of Biosynthetic Gene Clusters (BGCs) in texts.
Project description: The bgc-organism model is a Named Entity Recognition (NER) model that identifies and annotates the organisms of Biosynthetic Gene Clusters (BGCs) in texts.
Project description: The bgc-action model is a Named Entity Recognition (NER) model that identifies and annotates the actions of Biosynthetic Gene Clusters (BGCs) in texts.
Project description: The bgc-class model is a Named Entity Recognition (NER) model that identifies and annotates the chemical classes of Biosynthetic Gene Clusters (BGCs) in texts.
Project description: The bgc-gene-name model is a Named Entity Recognition (NER) model that identifies and annotates the genes of Biosynthetic Gene Clusters (BGCs) in texts.
Project description: The bgc-gene-product model is a Named Entity Recognition (NER) model that identifies and annotates the protein products of Biosynthetic Gene Clusters (BGCs) in texts.
Project description: An improved understanding of the genome-wide regulation of natural compound biosynthesis in bacterial producers may accelerate the discovery of novel biologically active molecules and facilitate their production. To this end, we have investigated the time course of genome-wide transcription in the myxobacterium Sorangium sp. So ce836 in relation to its production of natural compounds. Time-resolved RNA sequencing revealed the dynamic temporal variation of transcriptional activity, indicating that core biosynthesis genes from 48 biosynthetic gene clusters (BGCs; 92% of all BGCs encoded in the genome) were actively transcribed at specific time points in a batch culture. The majority (80%) of polyketide synthase and nonribosomal peptide synthetase genes displayed distinct peaks of transcription during exponential bacterial growth. Strikingly, these surges in BGC transcriptional activity were associated with boosts in the production of known natural compounds, indicating that their biosynthesis was crucially regulated at the transcriptional level. In contrast, BGC read counts from single time points had limited predictive value about biosynthetic activity, since transcription levels varied >100-fold among BGCs with detected natural products. Taken together, our time-course data provided unique insights into the dynamics of natural compound biosynthesis and its regulation in a wild-type myxobacterium, challenging the commonly cited notion of preferential BGC expression under meager conditions for bacterial growth. The close association between BGC transcription and compound production suggested that the molecular manipulation of transcriptional activity may be a viable strategy to increase compound yields from myxobacterial producer strains, warranting increased efforts to develop genetic engineering tools for these organisms.
Project description: Here, we profiled the transcriptional capacity of a library of regulatory sequences mined from diverse Biosynthetic Gene Clusters in S. albidoflavus (S. albus J1074) to investigate BGC gene regulation.
Project description: Streptomyces has the largest repertoire of natural product biosynthetic gene clusters (BGCs), yet developing a universal engineering strategy for each Streptomyces species is challenging. Given that some Streptomyces species have larger BGC repertoires than others, we hypothesized that a set of genes co-evolved with BGCs to support biosynthetic proficiency must exist in those strains, and that their identification may provide universal strategies to improve the productivity of other strains. We show here that genes co-evolved with natural product BGCs in Streptomyces can be identified by phylogenomic analysis. Among the 597 genes that co-evolved with polyketide BGCs, 11 genes in the “coenzyme” category have been examined, including a gene cluster encoding the cofactor pyrroloquinoline quinone (PQQ). When the pqq gene cluster was engineered into 11 Streptomyces strains, it enhanced production of 16,385 metabolites, including 36 known natural products with up to 40-fold improvement and several activated silent gene clusters. This study provides a new engineering strategy for improving polyketide production and discovering new biosynthetic gene clusters.
Project description: Background: Fungi are important sources of bioactive compounds that find applications in many important sectors, such as the pharmaceutical, food, and agricultural industries. In an environmental monitoring project for fungi involved in soil nitrogen cycling, we also isolated Cephalotrichum gorgonifer (strain NG_p51). In the course of strain characterization work, we found that this strain is able to produce high amounts of rasfonin, a polyketide that induces autophagy, apoptosis, and necroptosis in human cell lines and shows anti-tumor activity in RAS-dependent cancer cells. Results: In order to elucidate the biosynthetic pathway of rasfonin, the genome of the strain was sequenced and annotated, the strain was subjected to transcriptome analysis, and genetic transformation was established. Biosynthetic gene cluster (BGC) prediction revealed the existence of 22 BGCs, the majority of which were not expressed under our experimental conditions. In silico prediction revealed two BGCs with a suite of enzymes possibly involved in rasfonin biosynthesis. Experimental verification by gene knockout of the key enzyme genes showed that one of the predicted BGCs is indeed responsible for rasfonin biosynthesis. Conclusions: The results of this study lay the ground for molecular-biology-focused research in Cephalotrichum gorgonifer. Furthermore, strain engineering and heterologous expression of the rasfonin BGC are now possible, which facilitates the construction of high-producing strains or the synthesis of rasfonin derivatives for diverse applications.