Project description:<p><strong>BACKGROUND:</strong> <em>Peucedanum praeruptorum</em> Dunn (Apiaceae) has long been used in traditional Chinese medicine. The active constituents of the dried roots of <em>P. praeruptorum</em> are various coumarins, most notably praeruptorins A-E. Previous transcriptomic and metabolomic studies attempted to elucidate the distribution and biosynthetic network of these medicinally valuable compounds. However, the lack of a high-quality reference genome impedes an in-depth understanding of genetic traits and, thus, the development of better breeding strategies.</p><p><strong>RESULTS:</strong> The authors assembled a telomere-to-telomere (T2T) genome by combining PacBio HiFi, ONT ultra-long and Hi-C data. The final genome assembly was approximately 1.798 Gb, assigned to 11 chromosomes, with genome completeness >98%. Comparative genomic analysis suggested that <em>P. praeruptorum</em> experienced two whole-genome duplication (WGD) events, like those in the Apiaceae family. Through transcriptomic and metabolomic analysis of the coumarin metabolic pathway, the authors presented the spatial and temporal distribution of coumarins and the expression patterns of critical genes for their biosynthesis. Notably, the <em>COSY</em> and cytochrome <em>P450</em> genes showed tandem duplications on several chromosomes, which may be responsible for the high accumulation of coumarins.</p><p><strong>CONCLUSIONS:</strong> The authors obtained a T2T genome for <em>P. praeruptorum</em>, which provides molecular insights into the chromosomal distribution of the coumarin biosynthetic genes. This high-quality genome is an essential resource for designing engineering strategies to improve the production of these valuable compounds.</p>
Project description:The complete assembly of vast and complex plant genomes, like the hexaploid wheat genome, remains challenging. Here, we present CS-IAAS, a comprehensive telomere-to-telomere (T2T) gap-free Triticum aestivum L. reference genome, encompassing 14.51 billion base pairs and featuring all 21 centromeres and 42 telomeres. Annotation revealed 90.8 Mb of additional centromeric satellite arrays and 5,611 ribosomal DNA (rDNA) units. Genome-wide rearrangements, centromeric elements, TE expansion, and segmental duplications that arose during tetraploidization and hexaploidization were deciphered, providing a comprehensive understanding of wheat subgenome evolution. Among them, TE insertions during hexaploidization greatly influenced gene expression balance, thus increasing genome plasticity at the transcriptional level. Additionally, we generated 163,329 full-length cDNA sequences and proteomic data that helped annotate 141,035 high-confidence (HC) protein-coding genes. However, in this hexaploid genome, 20.05%, 33.43%, and 42.76% of gene transcript levels, alternative splicing events, and protein levels, respectively, were found to be unbalanced among subgenomes. The complete T2T reference genome (CS-IAAS), along with its transcriptome and proteome, represents a significant step in our understanding of wheat genome complexity and provides insights for future wheat research and breeding.
Project description:The methylation landscape of the cattle Y chromosome was characterized using methylated cytosine data produced on the PacBio and ONT long-read sequencing platforms.
Project description:Recent completion of the telomere-to-telomere (T2T) genome assembly has enabled a comprehensive characterization of pericentromeric SatI, SatII, and SatIII repeats and centromeric α-satellite repeats. SatIII DNA constitutes ~1.56% of the genome, with a reported localization across 16 chromosomes. The transcriptional activity of SatIII DNA across the genome and the sequences of SatIII transcripts remain largely unclear. We performed nanopore long-read RNA sequencing (RNA-seq) in untreated (UN), sodium arsenite-stressed (SA: 0.1 mM, 5 h) and heat shock-stressed (HS: 42°C, 2 h; 37°C, 1 h) HeLa cells to characterize SatIII transcripts. Since a portion of SatIII transcripts is non-polyadenylated, we performed nanopore cDNA long-read RNA-seq on both polyadenylated (poly(A)+) and rRNA-depleted (ribo-) RNA.