Project description:<p>The section <em>Oleifera</em> (Theaceae) has attracted attention for the high levels of unsaturated fatty acids found in its seeds. Here, we report a chromosome-scale genome for sect. <em>Oleifera</em>, assembled from diploid wild <em>Camellia lanceoleosa</em>, with a final size of 3.00 Gb and a scaffold N50 of 186.43 Mb. Repetitive sequences accounted for 80.63% of the genome and were distributed unevenly across it. <em>Camellia lanceoleosa</em> underwent a whole-genome duplication event approximately 65 million years ago (Mya), prior to the divergence of <em>C. lanceoleosa</em> and <em>Camellia sinensis</em> (approximately 6-7 Mya). Syntenic comparisons of these two species revealed genomic rearrangements that appear to have been driven in part by the activity of transposable elements. The expanded and positively selected genes in <em>C. lanceoleosa</em> were significantly enriched in oil biosynthesis, and the expansion of homomeric <em>acetyl-coenzyme A carboxylase</em> (<em>ACCase</em>) genes and the seed-biased expression of genes encoding heteromeric ACCase, diacylglycerol acyltransferase, glyceraldehyde-3-phosphate dehydrogenase and stearoyl-ACP desaturase could be of primary importance for the high oil and oleic acid content found in <em>C. lanceoleosa</em>. Theanine and catechins were present in the leaves of <em>C. lanceoleosa</em>. Caffeine, however, was not detected in the leaves but was abundant in the seeds and roots. The functional and transcriptional divergence of genes encoding SAM-dependent <em>N</em>-methyltransferases may be associated with caffeine accumulation and distribution. Gene expression profiles, structural composition and chromosomal location suggest that the late-acting self-incompatibility of <em>C. lanceoleosa</em> is likely to have favoured a novel mechanism co-occurring with gametophytic self-incompatibility.
This study provides valuable resources for the quantitative and qualitative improvement, and for the genome assembly, of polyploid plants in sect. <em>Oleifera</em>.</p>
Project description:<p><strong>BACKGROUND:</strong> Plants exhibit wide chemical diversity due to the production of specialized metabolites that function as pollinator attractants, defensive compounds, and signaling molecules. Lamiaceae (mints) are known for their chemodiversity and have been cultivated for use as culinary herbs, as well as sources of insect repellents, health-promoting compounds, and fragrance.</p><p><strong>FINDINGS:</strong> We report the chromosome-scale genome assembly of Callicarpa americana L. (American beautyberry), a species within the early-diverging Callicarpoideae clade of Lamiaceae, known for its metallic purple fruits and use as an insect repellent due to its production of terpenoids. Using long-read sequencing and Hi-C scaffolding, we generated a 506.1-Mb assembly spanning 17 pseudomolecules with N50 contig and N50 scaffold sizes of 7.5 and 29.0 Mb, respectively. In all, 32,164 genes were annotated, including 53 candidate terpene synthases and 47 putative clusters of specialized metabolite biosynthetic pathways. Our analyses revealed 3 putative whole-genome duplication events, which, together with local tandem duplications, contributed to gene family expansion of terpene synthases. Kolavenyl diphosphate is a gateway to many of the bioactive terpenoids in C. americana; experimental validation confirmed that CamTPS2 encodes kolavenyl diphosphate synthase. Syntenic analyses with Tectona grandis L. f. (teak), a member of the Tectonoideae clade of Lamiaceae known for exceptionally strong wood resistant to insects, revealed 963 collinear blocks and 21,297 C. americana syntelogs.</p><p><strong>CONCLUSIONS:</strong> Access to the C. americana genome provides a road map for rapid discovery of genes encoding plant-derived agrichemicals and a key resource for understanding the evolution of chemical diversity in Lamiaceae.</p>