Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.
ABSTRACT: Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first report of cp genomes in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 simple sequence repeats were identified. Four hypervariable regions, namely the intergenic regions trnQ_rps16, trnV_ndhC, and psbD_trnT and the intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding sequences, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses, together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and maximum likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.
Project description: BACKGROUND: Diospyros oleifera Cheng, of the family Ebenaceae, is an economically important tree. Phylogenetic analyses indicate that D. oleifera is closely related to Diospyros kaki Thunb. and could be used as a model plant for studies of D. kaki. Therefore, development of genomic resources of D. oleifera will facilitate auxiliary assembly of the hexaploid persimmon genome and elucidate the molecular mechanisms of important traits. FINDINGS: The D. oleifera genome was assembled with 443.6 Gb of raw reads using the Pacific Bioscience Sequel and Illumina HiSeq X Ten platforms. The final draft genome was ~812.3 Mb and had a high level of continuity with N50 of 3.36 Mb. Fifteen scaffolds corresponding to the 15 chromosomes were assembled to a final size of 721.5 Mb using 332 scaffolds, accounting for 88.81% of the genome. Repeat sequences accounted for 54.8% of the genome. By de novo sequencing and analysis of homology with other plant species, 30,530 protein-coding genes with an average transcript size of 7,105.40 bp were annotated; of these, 28,580 protein-coding genes (93.61%) had conserved functional motifs or terms. In addition, 171 candidate genes involved in tannin synthesis and deastringency in persimmon were identified; of these, chalcone synthase (CHS) genes were expanded in the D. oleifera genome compared with Diospyros lotus, Camellia sinensis, and Vitis vinifera. Moreover, 186 positively selected genes were identified, including the chalcone isomerase (CHI) gene, which encodes a key enzyme in the flavonoid-anthocyanin pathway. Phylogenetic tree analysis indicated that the split of D. oleifera and D. lotus likely occurred 9.0 million years ago. In addition to the ancient γ event, a second whole-genome duplication event occurred in D. oleifera and D. lotus. CONCLUSIONS: We generated a high-quality chromosome-level draft genome for D. oleifera, which will facilitate assembly of the hexaploid persimmon genome and further studies of major economic traits in the genus Diospyros.
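The N50 continuity statistic quoted in the entry above can be computed from a list of contig or scaffold lengths. The following is a minimal sketch of the usual definition, not the authors' pipeline; the function name `n50` and the toy lengths are hypothetical and are not taken from the D. oleifera assembly.

```python
def n50(lengths):
    """Return the N50 of an assembly.

    N50 is the largest length L such that sequences of length >= L
    together cover at least half of the total assembly size.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy scaffold lengths (arbitrary units), for illustration only.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))
```

With real data, the lengths would typically be read from the assembly FASTA index rather than typed in by hand.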
Project description: Persimmon (Diospyros kaki) is an oriental perennial woody fruit tree whose popular fruit is produced and consumed worldwide. The persimmon fruit is unique because of the hyperaccumulation of proanthocyanidins during fruit development, causing the mature fruit of most cultivars to have an astringent taste. In this study, we obtained a chromosome-scale genome assembly for 'Youshi' (Diospyros oleifera, 2n = 2x = 30), the diploid species of persimmon, by integrating Illumina sequencing, single-molecule real-time sequencing, and high-throughput chromosome conformation capture techniques. The assembled D. oleifera genome consisted of 849.53 Mb, 94.14% (799.71 Mb) of which was assigned to 15 pseudochromosomes, and is the first assembled genome for any member of the Ebenaceae. Comparative genomic analysis revealed that the D. oleifera genome underwent an ancient γ whole-genome duplication event. We studied the potential genetic basis for astringency development (proanthocyanidin biosynthesis) and removal (proanthocyanidin insolubilization). Proanthocyanidin biosynthesis genes were mainly distributed on chromosome 1, and the clustering of these genes is responsible for the genetic stability of astringency heredity. Genome-based RNA-seq identified deastringency genes, and promoter analysis showed that most of their promoters contained large numbers of low oxygen-responsive motifs, which is consistent with the efficient industrial application of high CO2 treatment to remove astringency. Using the D. oleifera genome as the reference, SLAF-seq indicated that 'Youshi' is one of the ancestors of the cultivated persimmon (2n = 6x = 90). Our study provides significant insights into the genetic basis of persimmon evolution and of the development and removal of astringency, and it will facilitate the improvement of persimmon fruit breeding.
Project description: The family Balsaminaceae, which consists of the economically important genus Impatiens and the monotypic genus Hydrocera, lacks a reported or published complete chloroplast genome sequence. Therefore, chloroplast genome sequences of the two sister genera are important for clarifying the phylogenetic position and understanding the evolution of the Balsaminaceae family among the Ericales. In this study, complete chloroplast (cp) genomes of Impatiens pinfanensis and Hydrocera triflora were characterized and assembled using a high-throughput sequencing method. The complete cp genomes were found to possess the typical quadripartite structure of land plant chloroplast genomes, with double-stranded molecules of 154,189 bp (Impatiens pinfanensis) and 152,238 bp (Hydrocera triflora) in length. A total of 115 unique genes were identified in both genomes, of which 80 are protein-coding genes, 31 are distinct transfer RNA (tRNA) genes, and four are distinct ribosomal RNA (rRNA) genes. Thirty codons, 29 of which end in A/T, had relative synonymous codon usage values of >1, whereas codons ending in G/C had values of <1. The simple sequence repeats comprise mostly the mononucleotide repeats A/T in all examined cp genomes. Phylogenetic analysis based on 51 common protein-coding genes indicated that the Balsaminaceae family formed a lineage with Ebenaceae together with all the other Ericales.
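Relative synonymous codon usage (RSCU), mentioned in the entry above, is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below illustrates that standard definition on a toy sequence. It is not the authors' software, the function name `rscu` is hypothetical, and it assumes the standard codon assignments (plastid translation table 11 uses the same codon-to-amino-acid mapping).

```python
from collections import Counter, defaultdict
from itertools import product

# Standard codon table built in TCAG order; "*" marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for every codon of each amino acid seen in `cds`."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")

    # Group synonymous codons by the amino acid they encode.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the sequence
        for c in family:
            # observed / expected, where expected = total / family size
            values[c] = counts[c] * len(family) / total
    return values


# Hypothetical toy coding sequence: Ala x4 (GCT x3, GCA x1), Lys x2 (AAA, AAG).
toy = "GCTGCTGCTGCAAAAAAG"
for codon, value in sorted(rscu(toy).items()):
    print(codon, round(value, 2))
```

A value above 1 means the codon is used more often than expected under equal usage, which is the sense in which the A/T-ending codons above are described as preferred.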
Project description: BACKGROUND: Fruits of persimmon plants are traditional healthy foods in China, Korea and Japan. However, because morphological and DNA markers are scarce, development of the persimmon industry has been severely constrained. RESULTS: Chloroplast genomes of Diospyros cathayensis, D. virginiana, D. rhombifolia and D. deyangensis were newly sequenced. Comparative analyses of ten chloroplast genomes, including six previously published chloroplast genomes of Diospyros, provided new insights into the genome sequence diversity and genomic resources of the genus. Eight hyper-variable regions, trnH-psbA, rps16-trnQ, rpoB-trnC, rps4-trnT-trnL, ndhF, ndhF-rpl32-trnL, ycf1a, and ycf1b, were discovered and can be used as chloroplast DNA markers at or above the species level. The complete chloroplast genome sequences provided the best resolution at the inter-specific level in comparison with different chloroplast DNA sequence datasets. CONCLUSION: Diospyros oleifera, D. deyangensis, D. virginiana, D. glaucifolia, D. lotus and D. jinzaoshi are important wild species closely related to the cultivated persimmon D. kaki. The hyper-variable regions can be used as DNA markers for global genetic diversity detection of Diospyros. Deeper study of these taxa would help elucidate the origin of D. kaki.
Project description: Knowledge about genetic diversity and relationships among germplasms could be an invaluable aid in Diospyros improvement strategies. This study was designed to analyze the genetic diversity and relationships of local and natural varieties in the Guangxi Zhuang Autonomous Region of China using start codon targeted polymorphism (SCoT) markers. Ninety-five Diospyros germplasm accessions belonging to four species, Diospyros kaki Thunb, D. oleifera Cheng, D. kaki var. silvestris Mak, and D. lotus Linn, were collected from different eco-climatic zones in Guangxi and analyzed using SCoT markers. The results indicated that the 95 accessions could be distinguished using SCoT markers and were divided into three groups at a similarity coefficient of 0.608; germplasms belonging to the same species clustered together; the natural D. kaki var. silvestris Mak population had the highest genetic diversity among the four species; and analysis by geographical distance showed that the 12 natural populations of D. kaki var. silvestris Mak were divided into two groups at a similarity coefficient of 0.19. To further verify the stability and usefulness of SCoT markers in Diospyros germplasms, SSR markers were also used to analyze genetic diversity and relationships in the same germplasms. Once again, the majority of germplasms belonging to the same species clustered together. Thus, SCoT markers were stable and especially useful for analyzing genetic diversity and relationships in Diospyros germplasms. The molecular characterization and diversity assessment of Diospyros are important for the conservation of Diospyros germplasm resources and for Diospyros improvement.
Project description: The continued investigation of the middle Miocene palynoflora from the Lavanttal Basin reveals numerous additional angiosperm taxa. The Myrtales to Ericales pollen record documented here comprises 46 different taxa belonging to Onagraceae (Ludwigia), Malvaceae (Craigia, Reevesia, Tilia), Anacardiaceae (Pistacia), Rutaceae (Zanthoxylum), Sapindaceae (Acer), Santalaceae (Arceuthobium), Amaranthaceae, Caryophyllaceae, Polygonaceae (Persicaria, Rumex), Cornaceae (Alangium, Cornus, Nyssa), Ebenaceae (Diospyros), Ericaceae (Andromeda, Arbutus, Empetrum, Erica), Sapotaceae (Pouteria, Sideroxylon), Styracaceae (Rehderodendron) and Symplocaceae (Symplocos). Köppen signatures of potential modern analogues of the additional fossil woody elements confirm the hypothesis of a subtropical (Cfa, Cwa) climate at lower elevations and subsequent transition into a temperate climate with altitudinal succession (Cfa → Cfb/Dfa → Dfb; Cwa → Cwb → Dwb climate). The fossil plants represent different vegetation units, from wetland lowlands to well-drained montane forests. Many of the fossil taxa have potential modern analogues that can be classified as nemoral and/or meridio-nemoral and/or semihumid-meridional vegetation elements. A new finding is the recognition of oreotropical elements, which are direct indicators of a substantial altitudinal gradient.
Project description: The plastome sequence of Diospyros blancoi A. DC. (Ebenaceae) was completed in this study (NCBI acc. no. KX426216). The gene order and structure of the D. blancoi plastome are collinear with the typical plastome of land plants. The complete plastome is 157,745 bp in length and consists of a large single-copy region of 87,246 bp and a small single-copy region of 18,323 bp, which are separated by a pair of 26,088 bp-long inverted repeat regions. The overall A-T content of the plastome sequence is 62.6%. The plastome contains 113 genes, of which 79 are protein-coding genes, 30 are tRNA genes, and 4 are rRNA genes. Sixteen genes contain one intron and two genes have two introns. A total of 45 simple sequence repeat loci were identified in the genome. Phylogenetic analysis revealed that D. blancoi is a sister group of Primulaceae with 100% bootstrap support.
Project description: The white-thread blight and black rot (WTBR) caused by basidiomycetous fungi of the genus Ceratobasidium is emerging as an important plant disease in Brazil, particularly for crop species in the Ericales such as persimmon (Diospyros kaki) and tea (Camellia sinensis). However, the species identity of the fungal pathogen associated with either of these hosts is still unclear. In this work, we used sequence variation in the internal transcribed spacer regions, including the 5.8S coding region of rDNA (ITS-5.8S rDNA), to determine the phylogenetic placement of the local white-thread-blight-associated populations of Ceratobasidium sp. from persimmon and tea, in relation to Ceratobasidium species already described world-wide. The two sister populations of Ceratobasidium sp. from persimmon and tea in the Brazilian Atlantic Forest agroecosystem most likely represent distinct species within Ceratobasidium and are also distinct from C. noxium, the etiological agent of the first description of white-thread blight disease that was reported on coffee in India. The intraspecific variation for the two Ceratobasidium sp. populations was also analyzed using three mitochondrial genes (ATP6, nad1 and nad2). As reported for other fungi, variation in nuclear and mitochondrial DNA was incongruent. Despite distinct variability in the ITS-rDNA region, these two populations shared similar mitochondrial DNA haplotypes.
Project description: Camellia oleifera is one of the four largest woody edible oil plants in the world, with high ecological and medicinal value. Because of frequent interspecific hybridization, its genetics and evolutionary history have been difficult to study. This study used C. oleifera collected on Hainan Island. The unique island environment makes the quality of the tea oil higher than that of other species grown on the mainland. Moreover, long-term geographic isolation might affect gene structure. To better understand the molecular biology of this species, protect excellent germplasm resources, and promote population genetics and phylogenetic studies of Camellia plants, high-throughput sequencing technology was used to obtain the chloroplast genome sequence of Hainan C. oleifera. The results showed that the whole chloroplast genome of C. oleifera in Hainan was 156,995 bp in length, with a typical quadripartite structure comprising a large single-copy (LSC) region of 86,648 bp, a small single-copy (SSC) region of 18,297 bp, and a pair of inverted repeats (IRs) of 26,025 bp each. The whole genome encoded a total of 141 genes (115 different genes), including 88 protein-coding genes, 45 tRNA genes, and eight rRNA genes. Among these genes, nine contained one intron and two contained two introns, and four overlapping genes were also detected. The total GC content of the Hainan C. oleifera chloroplast genome was 37.29%. The chloroplast genome structure of Hainan C. oleifera was compared with those of mainland C. oleifera and eight other closely related Theaceae species; contractions and expansions of the IR/LSC and IR/SSC regions were found to affect the length of the chloroplast genome. The chloroplast genome sequences of these Theaceae species were highly similar. A comparative analysis indicated that the Theaceae species were conserved in structure and evolution. A total of 51 simple sequence repeat (SSR) loci were detected in the chloroplast genome of Hainan C. oleifera, and none of the Camellia plants had pentanucleotide repeats, which could be used as a good marker in phylogenetic studies. We also detected seven long repeats; the base composition of all repeats was biased toward A/T, which was consistent with the codon bias. Codon usage and phylogenetic analysis indicated that Hainan C. oleifera is evolutionarily close to C. crapnelliana. This study provides an effective genomic resource for studying the evolutionary history of the Theaceae family.
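Simple sequence repeats such as those counted in the entry above are tandem repeats of a 1 to 6 bp motif. The sketch below shows one way to scan a sequence for them with regular expressions. It is an illustration only: the entry does not state which tool or thresholds were used, so the minimum repeat counts in `MIN_REPEATS` and the function name `find_ssrs` are assumptions.

```python
import re

# Assumed minimum number of tandem copies per motif length (1 to 6 bp).
# These thresholds are illustrative and not taken from the study.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return a sorted list of (start, motif, copies) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip motifs that are themselves repeats of a shorter motif
            # (for example "AA" or "ATAT"), which shorter unit lengths cover.
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


# Hypothetical toy sequence: an A run of 12 and an AT dinucleotide repeat of 6.
toy = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CCG"
print(find_ssrs(toy))
```

Tallying the motif lengths of the returned hits gives the per-class counts (mononucleotide, dinucleotide, and so on) that plastome papers usually report.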
Project description: Argania spinosa (Sapotaceae), an important endemic Moroccan oil tree, is a primary source of argan oil, which has numerous dietary and medicinal properties. The species occupies the mid-western part of Morocco and provides great environmental and socioeconomic benefits. The complete chloroplast (cp) genome of A. spinosa was sequenced, assembled, and analyzed in comparison with those of two Sapotaceae members. The A. spinosa cp genome is 158,848 bp long, with an average GC content of 36.8%. The cp genome exhibits a typical quadripartite and circular structure consisting of a pair of inverted repeat (IR) regions, each 25,945 bp in length, separating small single-copy (SSC) and large single-copy (LSC) regions of 18,591 and 88,367 bp, respectively. The annotation of the A. spinosa cp genome predicted 130 genes, including 85 protein-coding genes (CDS), 8 ribosomal RNA (rRNA) genes, and 37 transfer RNA (tRNA) genes. A total of 44 long repeats and 88 simple sequence repeats (SSR) divided into mononucleotides (76), dinucleotides (7), trinucleotides (3), tetranucleotides (1), and hexanucleotides (1) were identified in the A. spinosa cp genome. Phylogenetic analyses using the maximum likelihood (ML) method were performed based on 69 protein-coding genes from 11 species of Ericales. The results confirmed the close position of A. spinosa to the Sideroxylon genus, supporting the revisiting of its taxonomic status. The complete chloroplast genome sequence will be valuable for further studies on the conservation and breeding of this medicinally and culinarily important species and will also contribute to clarifying the phylogenetic position of the species within Sapotaceae.
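In a quadripartite plastome, the total length equals the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The short sketch below applies that identity to the region lengths reported in the entries above for Diospyros blancoi, Hainan Camellia oleifera, and Argania spinosa. The function name `plastome_length` is hypothetical; the numbers are copied from the entries.

```python
def plastome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + 2 x IR."""
    return lsc + ssc + 2 * ir


# (LSC, SSC, one IR copy, reported total), all in bp, as stated in the entries.
reported = {
    "Diospyros blancoi": (87_246, 18_323, 26_088, 157_745),
    "Camellia oleifera (Hainan)": (86_648, 18_297, 26_025, 156_995),
    "Argania spinosa": (88_367, 18_591, 25_945, 158_848),
}

for name, (lsc, ssc, ir, total) in reported.items():
    computed = plastome_length(lsc, ssc, ir)
    status = "consistent" if computed == total else "mismatch"
    print(f"{name}: computed {computed:,} bp, reported {total:,} bp ({status})")
```

The same check is a quick way to catch transcription errors when region lengths are copied between an annotation file and a manuscript.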