Project description: The Earth Biogenome Project has rapidly increased the number of available eukaryotic genomes, but most released genomes continue to lack annotation of protein-coding genes. In addition, no transcriptome data is available for some genomes. Various gene annotation tools have been developed but each has its limitations. Here, we introduce GALBA, a fully automated pipeline that utilizes miniprot, a rapid protein-to-genome aligner, in combination with AUGUSTUS to predict genes with high accuracy. Accuracy results indicate that GALBA is particularly strong in the annotation of large vertebrate genomes. We also present use cases in insects, vertebrates, and a previously unannotated land plant. GALBA is fully open source and available as a docker image for easy execution with Singularity in high-performance computing environments. Our pipeline addresses the critical need for accurate gene annotation in newly sequenced genomes, and we believe that GALBA will greatly facilitate genome annotation for diverse organisms.
Project description: Background: The Earth Biogenome Project has rapidly increased the number of available eukaryotic genomes, but most released genomes continue to lack annotation of protein-coding genes. In addition, no transcriptome data is available for some genomes. Results: Various gene annotation tools have been developed but each has its limitations. Here, we introduce GALBA, a fully automated pipeline that utilizes miniprot, a rapid protein-to-genome aligner, in combination with AUGUSTUS to predict genes with high accuracy. Accuracy results indicate that GALBA is particularly strong in the annotation of large vertebrate genomes. We also present use cases in insects, vertebrates, and a land plant. GALBA is fully open source and available as a docker image for easy execution with Singularity in high-performance computing environments. Conclusions: Our pipeline addresses the critical need for accurate gene annotation in newly sequenced genomes, and we believe that GALBA will greatly facilitate genome annotation for diverse organisms.
Project description: Two novel DNA base surrogate phosphoramidites 1 and 2, based upon relatively electron-rich 1,5-dialkoxynaphthalene (DAN) and relatively electron-deficient 1,4,5,8-naphthalenetetracarboxylic diimide (NDI), respectively, were designed, synthesized, and incorporated into DNA oligonucleotide strands. The DAN and NDI artificial DNA bases were inserted into a three-base-pair region in the interior of a 12-mer oligonucleotide duplex in various sequential arrangements and investigated with CD spectroscopy and UV melting curve analysis. The CD spectra of the modified duplexes indicated B-form DNA topology. Melting curve analyses revealed trends in DNA duplex stability that correlate with the known association of DAN and NDI moieties in aqueous solution as well as the known favorable interactions between NDI and natural DNA base pairs. This demonstrates that DNA duplex stability and specificity can be driven by the electrostatic complementarity between DAN and NDI. In the most favorable case, an NDI-DAN-NDI arrangement in the middle of the DNA duplex was found to be approximately as stabilizing as three A-T base pairs.