Project description:Macaque species share over 93% genome homology with humans and develop many disease phenotypes similar to those of humans, making them valuable animal models for the study of human diseases (e.g., HIV and neurodegenerative diseases). However, the quality of genome assembly and annotation for several macaque species lags behind the human genome effort. To close this gap and enhance functional genomics approaches, we combined de novo linked-read assembly with scaffolding by proximity ligation (Hi-C) to assemble the pig-tailed macaque (Macaca nemestrina) genome. This combined method yielded chromosome-level scaffolds with a scaffold N50 of 127.5 Mb; the 23 largest scaffolds covered 90% of the entire genome. The assembly revealed large-scale rearrangements between pig-tailed macaque chromosomes 7, 12, and 13 and human chromosomes 2, 14, and 15. We subsequently annotated the genome using transcriptome and proteomics data from personalized induced pluripotent stem cells (iPSCs) derived from the same animal. Reconstruction of the evolutionary tree using whole-genome annotation and orthologous comparisons among three macaque species, human, and mouse genomes revealed extensive homology between humans and pig-tailed macaques with regard to both pluripotent stem cell genes and innate immune gene pathways. Our results confirm that rhesus and cynomolgus macaques are evolutionarily closer to each other than either species is to humans or pig-tailed macaques. These findings demonstrate that pig-tailed macaques can serve as an excellent animal model for the study of many human diseases, particularly with regard to pluripotency and innate immune pathways.
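The contiguity figures quoted above (a scaffold N50, and the number of largest scaffolds needed to cover 90% of the genome) follow from a simple cumulative-length calculation. The sketch below is a minimal illustration of that standard definition, not the pipeline used by the authors; the function name `nx` and the toy scaffold lengths are hypothetical.

```python
from typing import Sequence, Tuple


def nx(lengths: Sequence[int], fraction: float = 0.5) -> Tuple[int, int]:
    """Return (Nx, Lx) for a set of scaffold or contig lengths.

    Nx is the length of the sequence at which the cumulative length,
    taken from longest to shortest, first reaches `fraction` of the
    total assembly length. Lx is how many sequences that takes.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)

    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count

    # Not reachable for fraction <= 1, kept as a safeguard.
    raise RuntimeError("cumulative length never reached the threshold")


# Hypothetical toy assembly (lengths in arbitrary units, not real data).
toy_scaffolds = [100, 80, 60, 40, 20]
n50, l50 = nx(toy_scaffolds, 0.5)
n90, l90 = nx(toy_scaffolds, 0.9)
print(f"N50={n50} (L50={l50}), N90={n90} (L90={l90})")
```

Under this definition, the statement that the 23 largest scaffolds cover 90% of the genome corresponds to an L90 of 23, computed from the real scaffold lengths rather than the toy values used here.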
Project description:The ratmouth barbel (Ptychidio jordani) is a freshwater fish of the family Cyprinidae that is critically endangered, primarily due to overfishing and habitat disruption. To address the challenges of its shrinking wild populations and the difficulties in artificial reproduction, we sequenced, assembled, and annotated a high-quality chromosome-level genome of P. jordani using next-generation short-read sequencing, third-generation long-read sequencing, and Hi-C sequencing. The final genome assembly was 1.14 Gb, consisting of 25 chromosomes with a contig N50 of 25.14 Mb and a scaffold N50 of 42.91 Mb. We identified 25,183 protein-coding genes, 751.75 Mb of repeats, and 19,373 ncRNAs. Methylation loci on most chromosomes ranged from 1,000 to 3,000 per 100 kb window. Gene expression levels across various tissues were analyzed, revealing 12,135 (caudal fin), 11,465 (liver), 14,438 (gill), 12,413 (heart), 8,301 (spleen), and 3,578 (kidney) differentially expressed genes compared to muscle. The comprehensive genomic and transcriptomic resources generated here will aid in understanding the ecology, adaptation, and environmental responses of P. jordani, supporting future research and conservation efforts.
Project description:Chlorops oryzae is a pest of rice that has caused severe damage to crops in major rice-growing areas in recent years. We generated a 447.60 Mb high-quality chromosome-level genome with contig and scaffold N50 values of 1.17 Mb and 117.57 Mb, respectively. Hi-C analysis anchored 93.22% of scaffolds to 4 chromosomes. The relatively high expression levels of heat shock proteins (HSPs) and antioxidant genes in response to thermal stress suggest that these genes may play a role in the environmental adaptability of C. oryzae. The identification of multiple pathways that regulate reproductive development (juvenile hormone, 20-hydroxyecdysone, and insulin signaling pathways) provides evidence that these pathways also play an important role in vitellogenesis and thus in insect population maintenance. These findings identify possible reasons for the increased frequency of outbreaks of C. oryzae in recent years. Our chromosome-level genome assembly may provide a basis for further genetic studies of C. oryzae and promote the development of novel, sustainable strategies to control this pest.
Project description:BACKGROUND: Plants exhibit wide chemical diversity due to the production of specialized metabolites that function as pollinator attractants, defensive compounds, and signaling molecules. Lamiaceae (mints) are known for their chemodiversity and have been cultivated for use as culinary herbs, as well as sources of insect repellents, health-promoting compounds, and fragrance. FINDINGS: We report the chromosome-scale genome assembly of Callicarpa americana L. (American beautyberry), a species within the early-diverging Callicarpoideae clade of Lamiaceae, known for its metallic purple fruits and use as an insect repellent due to its production of terpenoids. Using long-read sequencing and Hi-C scaffolding, we generated a 506.1-Mb assembly spanning 17 pseudomolecules with N50 contig and N50 scaffold sizes of 7.5 and 29.0 Mb, respectively. In all, 32,164 genes were annotated, including 53 candidate terpene synthases and 47 putative clusters of specialized metabolite biosynthetic pathways. Our analyses revealed 3 putative whole-genome duplication events, which, together with local tandem duplications, contributed to gene family expansion of terpene synthases. Kolavenyl diphosphate is a gateway to many of the bioactive terpenoids in C. americana; experimental validation confirmed that CamTPS2 encodes kolavenyl diphosphate synthase. Syntenic analyses with Tectona grandis L. f. (teak), a member of the Tectonoideae clade of Lamiaceae known for exceptionally strong wood resistant to insects, revealed 963 collinear blocks and 21,297 C. americana syntelogs. CONCLUSIONS: Access to the C. americana genome provides a road map for rapid discovery of genes encoding plant-derived agrichemicals and a key resource for understanding the evolution of chemical diversity in Lamiaceae.