Project description: Artemisia argyi, as famous as Artemisia annua, is a medicinal plant of the genus Artemisia with great economic value that has been used worldwide for about 3000 years. However, the lack of a reference genome severely hinders understanding of the genetic basis of active ingredient synthesis in A. argyi. Here, we report the first chromosome-level genome assembly of A. argyi, a complex genome with a large size of 8.03 Gb, high heterozygosity (2.36%), a high proportion of repetitive sequences (73.59%) and a huge number of protein-coding genes (279 294 in total). The assembly reveals at least three rounds of whole-genome duplication (WGD), including a recent WGD event in the A. argyi genome, and a recent burst of transposable elements, which may contribute to its large genome size. The genomic data and karyotype analyses confirmed that A. argyi is an allotetraploid with 34 chromosomes. Intragenome synteny analysis revealed that a chromosome fusion event occurred in the A. argyi genome, which elucidates the changes in basic chromosome number in the genus Artemisia. Significant expansion of genes related to photosynthesis, DNA replication, stress responses and secondary metabolism was identified in A. argyi, which explains its extensive environmental adaptability and rapid growth. In addition, we analysed genes involved in the biosynthesis pathways of flavonoids and terpenoids, and found that extensive gene amplification and tandem duplication contributed to the high metabolite contents of A. argyi. Overall, the reference genome assembly provides scientific support for evolutionary biology, functional genomics and breeding in A. argyi and other Artemisia species.
Project description: Artemisia argyi, called wormwood, is widely distributed in northeastern Asia. The complete chloroplast genome sequence of A. argyi was generated by de novo assembly of whole-genome next-generation sequencing data. The complete chloroplast genome of A. argyi is 151 192 bp in size. It is composed of a large single-copy (LSC) region of 82 930 bp, a small single-copy (SSC) region of 18 344 bp and two inverted repeat (IR) regions of 24 959 bp each. The overall GC content of the genome is 37.46%. The A. argyi chloroplast genome has a total of 114 genes, including 80 protein-coding genes, 30 tRNA genes and four rRNA genes. Phylogenetic analysis based on the chloroplast genome demonstrated that A. argyi is most closely related to Artemisia montana.
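The region sizes reported above can be checked against the stated total with a little arithmetic, on the assumption that the genome has the usual quadripartite layout (one LSC, one SSC and two IR copies). The following is a minimal sketch; the function and variable names are illustrative and do not come from the study.

```python
# Region lengths in bp, copied from the description above.
LSC = 82_930
SSC = 18_344
IR = 24_959            # length of ONE inverted repeat; the genome carries two
REPORTED_TOTAL = 151_192


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a chloroplast genome with one LSC, one SSC and two IRs."""
    return lsc + ssc + 2 * ir


# Gene counts by class, copied from the description above.
gene_counts = {"protein_coding": 80, "tRNA": 30, "rRNA": 4}

total = quadripartite_length(LSC, SSC, IR)
print("computed total:", total, "matches reported:", total == REPORTED_TOTAL)
print("total genes:", sum(gene_counts.values()))
```

Both the region lengths and the per-class gene counts add up to the reported totals (151 192 bp and 114 genes).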
Project description: Artemisia argyi Levl. et Van is an important Asteraceae species with high medicinal value. There are abundant A. argyi germplasm resources in Asia, especially in China, but the evolutionary relationships among these varieties and the systematic position of A. argyi within the family Asteraceae are still unclear. In this study, the chloroplast (cp) genomes of 72 A. argyi varieties were systematically analyzed. The 72 varieties originated from 47 regions in China at different longitudes, latitudes and altitudes, and included both wild and cultivated varieties. The A. argyi cp genome was found to be ∼151 kb in size and to contain 114 genes, including 82 protein-coding, 28 tRNA, and 4 rRNA genes. The number of short sequence repeats (SSRs) in A. argyi cp genomes ranged from 35 to 42, and most of them were mononucleotide A/T repeats. A total of 196 polymorphic sites were detected in the cp genomes of the 72 varieties. Phylogenetic analysis demonstrated that the genetic relationships among A. argyi varieties were only weakly associated with their geographical distribution. Furthermore, the inverted repeat (IR) boundaries of 10 Artemisia species were found to be significantly different. A sequence divergence analysis of Asteraceae cp genomes showed that the variable regions were mostly located in single-copy (SC) regions and that the coding regions were more conserved than the non-coding regions. A phylogenetic tree was constructed using 43 protein-coding genes common to 67 Asteraceae species. The resulting tree was consistent with the traditional classification system; Artemisia species were clustered into one group, and A. argyi was shown to be closely related to Artemisia lactiflora and Artemisia montana. In summary, this study systematically analyzed the cp genome characteristics of A. argyi and compared the cp genomes of Asteraceae species. The results provide valuable information for the definitive identification of A. argyi varieties and for the understanding of the evolutionary relationships between Asteraceae species.
Project description: Camellia oil extracted from Camellia seeds is rich in unsaturated fatty acids (UFAs) and secondary metabolites beneficial to human health. However, no oil-tea tree genome has yet been published, which is a major obstacle to the genetic improvement of oil-tea trees. Here, using both Illumina and PacBio sequencing technologies, we present the first chromosome-level genome sequence of the oil-tea tree species Camellia chekiangoleosa Hu. (CCH). The assembled genome consists of 15 pseudochromosomes with a genome size of 2.73 Gb and a scaffold N50 of 185.30 Mb. At least 2.16 Gb of the genome assembly consists of repetitive sequences, and the rest contains a high-confidence set of 64 608 protein-coding gene models. Comparative genomic analysis revealed that the CCH genome underwent a whole-genome duplication (WGD) event shared across the Camellia genus at ~57.48 MYA and a γ-WGT event shared across all core eudicot plants at ~120 MYA. Gene family clustering revealed that the genes involved in terpenoid biosynthesis have undergone rapid expansion. Furthermore, we determined the expression patterns of oleic acid accumulation- and terpenoid biosynthesis-associated genes in six tissues. We found that these genes tend to be highly expressed in leaves, pericarp tissues, roots, and seeds. The first chromosome-level genome of oil-tea trees will provide valuable resources for determining Camellia evolution and utilizing the germplasm of this taxon.
Project description: Artemisiae argyi Folium is a traditional herbal medicine used for moxibustion heat therapy in China. The volatile oils in A. argyi leaves are closely related to its medicinal value. Records suggest that the levels of the terpenoid components within the leaves vary as a function of harvest time, with June being the optimal time for A. argyi harvesting, owing to the high levels of active ingredients during this month. However, the molecular mechanisms governing terpenoid biosynthesis and the time-dependent changes in this activity remain unclear. In this study, GC-MS analysis revealed that volatile oil levels in A. argyi leaves varied across four different harvest months (April, May, June, and July), and that the primary terpenoid components (including both monoterpenes and sesquiterpenes) reached peak levels in early June. Through single-molecule real-time (SMRT) sequencing, corrected by Illumina RNA-sequencing (RNA-Seq), 44 full-length transcripts potentially involved in terpenoid biosynthesis were identified. Differentially expressed genes (DEGs) exhibiting time-dependent expression patterns were divided into 12 coexpression clusters. Integrated chemical and transcriptomic analyses revealed distinct time-specific transcriptomic patterns associated with terpenoid biosynthesis. Subsequent hierarchical clustering and correlation analyses ultimately identified six transcripts that were closely linked to the production of these two types of terpenoids within A. argyi leaves, revealing that the structural diversity of terpenoids is related to the generation of diverse terpene skeletons by the prenyltransferase (TPS) family of enzymes. These findings can guide further studies of the molecular mechanisms underlying the quality of A. argyi leaves, aiding in the selection of the optimal harvest time for A. argyi.
Project description: Macadamia is an evergreen tree belonging to the Proteaceae family. The two commercial macadamia species, Macadamia integrifolia and M. tetraphylla, are highly prized for their edible kernels. The M. integrifolia genome was recently sequenced, but the genome of M. tetraphylla has not yet been published, which limits biological research and breeding in this species. This study reports a high-quality genome sequence of M. tetraphylla based on Oxford Nanopore Technologies sequencing and high-throughput chromosome conformation capture (Hi-C). An assembly of 750.87 Mb with an N50 length of 51.11 Mb was generated, close to the size estimates of 740 Mb and 758 Mb obtained by flow cytometry and k-mer analysis, respectively. Genome annotation indicated that 61.42% of the genome is composed of repetitive sequences and 34.95% is composed of long terminal repeat retrotransposons. Up to 31,571 protein-coding genes were predicted, of which 92.59% were functionally annotated. The average gene length was 6,055 bp. Comparative genome analysis revealed that the gene families associated with defense response, lipid transport, steroid biosynthesis, triglyceride lipase activity, and fatty acid metabolism are expanded in the M. tetraphylla genome. The distribution of fourfold synonymous third-codon transversion showed a recent whole-genome duplication event in M. tetraphylla. Genomic and transcriptomic analysis identified 187 genes encoding 33 crucial oil biosynthesis enzymes, depicting a comprehensive map of macadamia lipid biosynthesis. In addition, the 55 identified WRKY genes exhibited preferential expression in roots compared with other tissues. The genome sequence of M. tetraphylla provides novel insights for breeding new varieties and for the genetic improvement of agronomic traits.
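Several of these descriptions report an N50 length as a measure of assembly contiguity. As a minimal sketch of how that statistic is conventionally computed, the function below returns the length of the sequence at which the cumulative length, counted from the longest sequence downward, first reaches half of the total assembly size. The function name and the toy lengths are illustrative assumptions, not data from any of the studies.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are sorted from longest to shortest; N50 is the length of the
    sequence at which the running sum first reaches half of the total length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() requires at least one sequence length")


# Toy example (hypothetical lengths, arbitrary units): total is 290, half is 145.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))
```

In the toy example the two longest sequences (80 + 70 = 150) are the first to cover half of the total, so the N50 is 70.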
Project description: Coptis chinensis Franch, a perennial herb, is mainly distributed in southeastern China. The rhizome of C. chinensis has been used as a traditional medicine for more than 2000 years in China and many other Asian countries. The pharmacological activities of C. chinensis have been validated by research. Here, we present a high-quality de novo chromosome-level genome assembly of C. chinensis of ~958.20 Mb, with a contig N50 of 1.58 Mb and a scaffold N50 of 4.53 Mb. We found that the relatively large genome size of C. chinensis was caused by the amplification of long terminal repeat (LTR) retrotransposons. In addition, a whole-genome duplication event in ancestral Ranunculales was discovered. Comparative genomic analysis revealed that the tyrosine decarboxylase (TYDC) and (S)-norcoclaurine synthase (NCS) genes were expanded and that the aspartate aminotransferase gene (ASP5) was positively selected in the berberine metabolic pathway. Expression level and HPLC analyses showed that the berberine content was highest in the roots of C. chinensis in the third and fourth years. The chromosome-level reference genome of C. chinensis provides important genomic data for molecular-assisted breeding and for studies of active ingredient biosynthesis.