Project description:<p><strong>BACKGROUND:</strong> <em>Peucedanum praeruptorum</em> Dunn (Apiaceae) has long been used in traditional Chinese medicine. The active constituents of its dried roots are various coumarins, including the major root constituents praeruptorins A-E. Previous transcriptomic and metabolomic studies attempted to elucidate the distribution and biosynthetic network of these medicinally valuable compounds. However, the lack of a high-quality reference genome impedes an in-depth understanding of genetic traits and, thus, the development of better breeding strategies.</p><p><strong>RESULTS:</strong> The authors assembled a telomere-to-telomere (T2T) genome by combining PacBio HiFi, ONT ultra-long and Hi-C data. The final genome assembly was approximately 1.798 Gb, assigned to 11 chromosomes, with genome completeness >98%. Comparative genomic analysis suggested that <em>P. praeruptorum</em> experienced two whole-genome duplication (WGD) events, like those in the Apiaceae family. Through transcriptomic and metabolomic analysis of the coumarin metabolic pathway, the authors characterized the spatial and temporal distribution of coumarins and the expression patterns of genes critical for their biosynthesis. Notably, the <em>COSY</em> and cytochrome <em>P450</em> genes showed tandem duplications on several chromosomes, which may be responsible for the high accumulation of coumarins.</p><p><strong>CONCLUSIONS:</strong> The authors obtained a T2T genome for <em>P. praeruptorum</em>, which provides molecular insights into the chromosomal distribution of the coumarin biosynthetic genes. This high-quality genome is an essential resource for designing engineering strategies to improve the production of these valuable compounds.</p>
Project description:The complete assembly of vast and complex plant genomes, like the hexaploid wheat genome, remains challenging. Here, we present CS-IAAS, a comprehensive telomere-to-telomere (T2T) gap-free Triticum aestivum L. reference genome, encompassing 14.51 billion base pairs and featuring all 21 centromeres and 42 telomeres. Annotation revealed an additional 90.8 Mb of centromeric satellite arrays and 5,611 ribosomal DNA (rDNA) units. Genome-wide rearrangements, changes in centromeric elements, TE expansion, and segmental duplications that occurred during tetraploidization and hexaploidization were deciphered, providing a comprehensive understanding of wheat subgenome evolution. Among them, TE insertions during hexaploidization greatly influenced gene expression balance, thus increasing genome plasticity at the transcriptional level. Additionally, we generated 163,329 full-length cDNA sequences and proteomic data that helped annotate 141,035 high-confidence (HC) protein-coding genes. However, in this hexaploid genome, 20.05%, 33.43%, and 42.76% of gene transcript levels, alternative splicing events, and protein levels, respectively, were unbalanced among subgenomes. The complete T2T reference genome (CS-IAAS), along with its transcriptome and proteome, represents a significant step in our understanding of wheat genome complexity and provides insights for future wheat research and breeding.
Project description:The methylation landscape of the cattle Y chromosome was characterized using methylated cytosine data produced on the PacBio and ONT long-read sequencing platforms.
Project description:Purpose: The aim of this study is to compare different long-read sequencing platforms using reference lung adenocarcinoma cell lines and spike-in controls. Methods - Cell Culture: Lung adenocarcinoma cell lines NCI-H1975 and HCC827 from a range of passages (2-4) were grown on three separate occasions in Roswell Park Memorial Institute (RPMI) 1640 medium with 10% fetal calf serum and 1% penicillin-streptomycin. Methods - RNA preparation: mRNA was extracted using a Qiagen RNA miniprep kit and purified using the NEBNext® Poly(A) mRNA Magnetic Isolation Module (E7490). Purified mRNA spiked with sequins was used for next-generation sequencing library preparation using the NEBNext Ultra II Directional RNA Library Prep Kit (Illumina) and the cDNA-PCR Barcoding (SQK-PCS109 with SQK-PBK004) kit (ONT). Completed libraries were sequenced on NextSeq 500 (Illumina) and PromethION (ONT). Iso-Seq libraries were prepared and sequenced by Novogene on Sequel II (PacBio). Reads were mapped to the combined sequences of the GRCh38 reference genome and the RNA sequin decoy chromosome; mapped reads were then assigned to known genomic features and summarized into gene-level counts using featureCounts (Liao et al. 2014).