Project description:We describe an improved individual nucleotide resolution CLIP protocol (iiCLIP), which can be completed within 4 days from UV crosslinking to libraries for sequencing. For benchmarking, we directly compared PTBP1 iiCLIP libraries with a PTBP1 iCLIP2 library produced under standardised conditions with 1 million HEK293 cells, and with public eCLIP and iCLIP PTBP1 data. This study produced three PTBP1 iiCLIP libraries, one input iiCLIP library and one PTBP1 iCLIP2 library.
Project description:With its capacity for high-resolution data output in one region of interest, chromosome conformation capture combined with high-throughput sequencing (4C-seq) is a state-of-the-art next-generation sequencing technique that provides epigenetic insights, and regularly advances current medical research. However, 4C-seq data is complex and prone to biases, and while specialized programs exist, an unbiased, extensive benchmarking is still lacking. Furthermore, neither substantial datasets with fully characterized ground truth, nor simulation programs for realistic 4C-seq data have been published. We conducted a benchmarking study on 54 4C-seq samples from 12 datasets, including original murine BMM, T-cell, and 416B data, and developed a novel 4C-seq simulation software to allow for more detailed comparisons of 4C-seq algorithms on 50 simulated datasets with 10 to 120 samples each.
Project description:Sample index hopping refers to the incorrect sample assignment of a demultiplexed sequencing read in a library pool. To enable benchmarking of methods that measure the index hopping rate and remove its artifacts in single-cell RNA-seq data, we developed a validation dataset consisting of a multiplexed library of two samples, in which the true sample of origin of most reads is known. The reads with known sample of origin provide the ground truth for measuring the performance of index-hopping correction methods.
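A minimal sketch of how such a ground truth could be used to estimate an index hopping rate. The function name `index_hopping_rate` and the input layout (one assigned label and one true label per read, with `None` where the origin is unknown) are assumptions made for illustration and are not taken from the dataset itself.

```python
def index_hopping_rate(assigned, truth):
    """Fraction of reads with known origin that were assigned to another sample.

    assigned: sample label given to each read by demultiplexing.
    truth:    true sample of origin per read, or None when it is unknown
              (hypothetical encoding chosen for this sketch).
    """
    # Only reads with a known sample of origin can serve as ground truth.
    known = [(a, t) for a, t in zip(assigned, truth) if t is not None]
    if not known:
        raise ValueError("no reads with a known sample of origin")
    hopped = sum(1 for a, t in known if a != t)
    return hopped / len(known)


if __name__ == "__main__":
    assigned = ["A", "B", "A", "B", "A"]
    truth = ["A", "A", None, "B", "A"]
    print(index_hopping_rate(assigned, truth))
```

Reads without a known origin are excluded from the denominator, so the estimate describes only the subset for which ground truth exists.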
Project description:RNA-sequencing has become the gold standard for whole-transcriptome gene expression quantification. Multiple algorithms have been developed to derive gene counts from sequencing reads. While a number of benchmarking studies have been conducted, it remains unclear how accurately individual methods quantify gene expression levels from RNA-sequencing reads. We performed an independent benchmarking study using RNA-sequencing data from the well-established MAQCA and MAQCB reference samples. RNA-sequencing reads were processed using five popular workflows (Tophat-HTSeq, Tophat-Cufflinks, STAR-HTSeq, Kallisto and Salmon), and the resulting gene expression measurements were compared to expression data generated by wet-lab validated qPCR assays for all protein-coding genes. All methods showed high gene expression rank correlations with qPCR data. When comparing gene expression fold changes between MAQCA and MAQCB samples, about 85% of the genes showed consistent results between RNA-sequencing and qPCR data. Of note, each method revealed a small but specific set of genes with inconsistent expression measurements. A significant proportion of these method-specific inconsistent genes were reproducibly identified in independent datasets. These genes were typically smaller, had fewer exons and were expressed at lower levels than genes with consistent expression measurements. We propose that careful validation is warranted when evaluating RNA-seq-based expression profiles for this specific set of genes.
2017-01-31 | GSE83402 | GEO
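The two comparisons named in the description above (rank correlation with qPCR, and fold-change consistency between MAQCA and MAQCB) can be sketched as follows. This is an illustrative sketch only: the function names, the log2 fold-change threshold of 1.0, and the rule that two measurements are "consistent" when they produce the same up/down/unchanged call are assumptions, since the description does not state how consistency was defined.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def fold_change_concordance(log2fc_rnaseq, log2fc_qpcr, threshold=1.0):
    """Fraction of genes with the same up/down/unchanged call in both assays.

    The threshold is an assumed cut-off, not a value from the study.
    """
    def call(value):
        if value >= threshold:
            return 1
        if value <= -threshold:
            return -1
        return 0

    pairs = list(zip(log2fc_rnaseq, log2fc_qpcr))
    agree = sum(1 for a, b in pairs if call(a) == call(b))
    return agree / len(pairs)


if __name__ == "__main__":
    print(spearman([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]))
    print(fold_change_concordance([2.0, 0.1, -1.5, 1.2], [1.8, 0.2, -0.3, 1.1]))
```

In practice `scipy.stats.spearmanr` could replace the hand-written correlation; the pure-Python version is used here only to keep the sketch dependency-free.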
Project description:Benchmarking somatic variant calling with long-read data on mitochondrial DNA
Project description:Brain organoids (BO) have enabled the investigation of human corticogenesis in vitro, with an increasing range of protocols recapitulating it remarkably well. However, we lack a resource that gathers fetal cortex-specific gene co-expression patterns and their behavior in BO. We complement current knowledge by benchmarking BO against human corticogenesis, integrating transcriptomes from in-house differentiated cortical BO (CBO), in-house processed human fetal brain samples, and analyses of transcriptomes from different BO systems and of prenatal cortical samples from the BrainSpan Atlas.