Project description:We used PacBio data to identify more reliable transcripts in hESCs, which in turn allows better estimation of gene/transcript abundance from Illumina data. PacBio long reads and Illumina short reads were generated from the same hESC line, H1. PacBio reads were error-corrected with Illumina reads to identify transcripts. rSeq was used to estimate gene/transcript abundance for the identified transcriptome.
Project description:PacBio SMRTseq long reads and Illumina short reads of pig testis, epididymis, vesicular gland, prostate gland, and bulbourethral gland
Project description:Since short reads from Illumina RNA-seq data are challenging to map to repetitive elements, we wanted to confirm the bulk RNA-seq findings using an orthogonal method, namely Pacific Biosciences (PacBio) long-read, full-length transcriptome sequencing. This dataset provided around 1.1 million (WT) and 1.3 million (RBM4 KO) sequence reads, with an average length of 2.6 kb, mapping to the human genome.
2020-10-20 | GSE147896 | GEO
Project description:Genome of Hygrophila difformis
Project description:We used an approach combining PacBio data and published Illumina reads to de novo assemble D. busckii contigs. We generated Hi-C data from D. busckii embryos to order these contigs into chromosome-length scaffolds. For D. virilis we generated Hi-C data to order and orient the published Dvir_caf1 scaffolds into chromosome-length assemblies. Furthermore, we compared Hi-C matrices from these two new assemblies with D. melanogaster with respect to synteny blocks and dosage compensation as a chromosome-wide gene-regulatory mechanism.
Project description:Pioneering studies (PXD014844) have identified many interesting molecules in tick saliva by LC-MS/MS proteomics, but the protein databases used to assign mass spectra were based on short Illumina reads of the Amblyomma americanum transcriptome and may not have captured the diversity and complexity of longer transcripts. Here we apply Pacific Biosciences long-read technology to complement the previously reported short-read Illumina transcriptome-based proteome in an effort to increase spectrum assignments. Our dataset reveals a small increase in assignable spectra, supplementing the previously released short-read transcriptome-based proteome.
Project description:Pioneering studies (PXD014844) have identified many interesting molecules by LC-MS/MS proteomics, but the protein databases used to assign mass spectra were based on short Illumina reads of the Amblyomma americanum transcriptome and may not have captured the diversity and complexity of longer transcripts. Here we apply Pacific Biosciences long-read technology to complement the previously reported short-read Illumina transcriptome-based proteome in an effort to increase spectrum assignments. Our dataset reveals a small increase in assignable spectra, supplementing the previously released short-read transcriptome-based proteome.
Project description:We used an approach combining PacBio data and published Illumina reads to de novo assemble D. busckii contigs. We generated Hi-C data from D. busckii embryos to order these contigs into chromosome-length scaffolds. For D. virilis we generated Hi-C data to order and orient the published Dvir_caf1 scaffolds into chromosome-length assemblies. Furthermore, we compared Hi-C matrices from these two new assemblies with D. melanogaster with respect to synteny blocks and dosage compensation as a chromosome-wide gene-regulatory mechanism.
Project description:Megasphaera hexanoica KCCM 43214T, isolated from cow rumen, is capable of producing medium-chain carboxylic acids such as n-caproate and n-caprylate. In this study, we present a high-quality genome assembly, along with intracellular metabolomic profiling and pangenomic analysis. Illumina sequencing generated 2.3 Mbp from 15,293,634 reads with a GC content of 49.5%, while PacBio HiFi sequencing produced 331.5 Mbp across 45,266 reads, with an average read length of 7,323 bp and a HiFi read N50 of 8,214 bp. Hybrid assembly of short and long reads resulted in a single 2.88 Mbp contig containing 2,075-2,083 unique genes. A genome-scale metabolic model was constructed to evaluate its metabolic capabilities under specific growth conditions. Intracellular metabolomic analysis of cells grown in fructose medium and lactate medium revealed key metabolic activities associated with chain elongation. Pangenomic analysis across nine annotated genomes identified 6,721 orthologous genes using OrthoMCL, emphasizing the genetic and functional diversity within the Megasphaera genus. This dataset offers valuable insights into the metabolism and biotechnological potential of M. hexanoica KCCM 43214T.