Project description:MicroRNAs (miRNAs) are involved in post-transcriptional regulation of gene expression. Since several miRNAs are known to affect the stability or translation of developmental regulatory genes, the origin of novel miRNAs may have contributed to the evolution of developmental processes and morphology. Lepidoptera (butterflies and moths) is a species-rich clade with a well-established phylogeny and abundant genomic resources, thereby representing an ideal system in which to study miRNA evolution. We sequenced small RNA libraries from developmental stages of two divergent lepidopterans, Cameraria ohridella (Horse chestnut Leafminer) and Pararge aegeria (Speckled Wood butterfly), discovering 90 and 81 conserved miRNAs respectively, and many species-specific miRNA sequences. Mapping miRNAs onto the lepidopteran phylogeny reveals rapid miRNA turnover and an episode of miRNA fixation early in lepidopteran evolution, implying that miRNA acquisition accompanied the early radiation of the Lepidoptera. One lepidopteran-specific miRNA gene, miR-2768, is located within an intron of the homeobox gene invected, involved in insect segmental and wing patterning. We identified cubitus interruptus (ci) as a likely direct target of miR-2768, and validated this suppression using a luciferase assay system. We propose a model by which miR-2768 modulates expression of ci in the segmentation pathway and in patterning of lepidopteran wing primordia. Examination of the small RNA complements pooled across life cycle stages in each of Cameraria ohridella and Pararge aegeria.
Project description:The naked mole-rat (NMR; Heterocephalus glaber) has recently gained considerable attention in the scientific community for its unique potential to unveil novel insights in the fields of medicine, biochemistry, and evolution. NMRs exhibit unique adaptations that include protracted fertility, cancer resistance, eusociality, and anoxia. This suite of adaptations is not found in other rodent species, suggesting that interrogating conserved and accelerated regions in the NMR genome will find regions of the NMR genome fundamental to their unique adaptations. However, the current NMR genome assembly has limits that make studying structural variations, heterozygosity, and non-coding adaptations challenging. We present a complete diploid naked-mole rat genome assembly by integrating long-read and 10X-linked read genome sequencing of a male NMR and its parents, and Hi-C sequencing in the NMR hypothalamus (N=2). Reads were identified as maternal, paternal or ambiguous (TrioCanu). We then polished genomes with Flye, Racon and Medaka. Assemblies were then scaffolded using the following tools in order: Scaff10X, Salsa2, 3d-DNA, Minimap2-alignment between assemblies, and the Juicebox Assembly Tools. We then subjected the assemblies to another round of polishing, including short-read polishing with Freebayes. We assembled the NMR mitochondrial genome with mitoVGP. Y chromosome contigs were identified by aligning male and female 10X linked reads to the paternal genome and finding male-biased contigs not present in the maternal genome. Contigs were assembled with publicly available male NMR Fibroblast Hi-C-seq data (SRR820318). Both assemblies have their sex chromosome haplotypes merged so that both assemblies have a high-quality X and Y chromosome. Finally, assemblies were evaluated with Quast, BUSCO, and Merqury, which all reported the base-pair quality and contiguity of both assemblies as high-quality. The assembly will next be annotated by Ensembl using public RNA-seq data from multiple tissues (SRP061363). Together, this assembly will provide a high-quality resource to the NMR and comparative genomics communities.
Project description:The naked mole-rat (NMR; Heterocephalus glaber) has recently gained considerable attention in the scientific community for its unique potential to unveil novel insights in the fields of medicine, biochemistry, and evolution. NMRs exhibit unique adaptations that include protracted fertility, cancer resistance, eusociality, and anoxia. This suite of adaptations is not found in other rodent species, suggesting that interrogating conserved and accelerated regions in the NMR genome will find regions of the NMR genome fundamental to their unique adaptations. However, the current NMR genome assembly has limits that make studying structural variations, heterozygosity, and non-coding adaptations challenging. We present a complete diploid naked-mole rat genome assembly by integrating long-read and 10X-linked read genome sequencing of a male NMR and its parents, and Hi-C sequencing in the NMR hypothalamus (N=2). Reads were identified as maternal, paternal or ambiguous (TrioCanu). We then polished genomes with Flye, Racon and Medaka. Assemblies were then scaffolded using the following tools in order: Scaff10X, Salsa2, 3d-DNA, Minimap2-alignment between assemblies, and the Juicebox Assembly Tools. We then subjected the assemblies to another round of polishing, including short-read polishing with Freebayes. We assembled the NMR mitochondrial genome with mitoVGP. Y chromosome contigs were identified by aligning male and female 10X linked reads to the paternal genome and finding male-biased contigs not present in the maternal genome. Contigs were assembled with publicly available male NMR Fibroblast Hi-C-seq data (SRR820318). Both assemblies have their sex chromosome haplotypes merged so that both assemblies have a high-quality X and Y chromosome. Finally, assemblies were evaluated with Quast, BUSCO, and Merqury, which all reported the base-pair quality and contiguity of both assemblies as high-quality. The assembly will next be annotated by Ensembl using public RNA-seq data from multiple tissues (SRP061363). Together, this assembly will provide a high-quality resource to the NMR and comparative genomics communities.
Project description:Porcine 60K BeadChip genotyping arrays (Illumina) are increasingly being applied in pig genomics to validate SNPs identified by re-sequencing or assembly-versus-assembly method. Here we report that more than 98% SNPs identified from the porcine 60K BeadChip genotyping array (Illumina) were consistent with the SNPs identified from the assembly-based method. This result demonstrates that whole-genome de novo assembly is a reliable approach to deriving accurate maps of SNPs.