Dataset Information

Insert-seq

ABSTRACT: Insert-seq enables high resolution mapping of genomically integrated DNA using long read technologies

PROVIDER: PRJEB46760 | ENA

REPOSITORIES: ENA

Similar Datasets

Long Insert Mouse Genomes

Project description: Long Insert Mouse Genomes

PRJEB2200 | ENA

Enhanced whole exome sequencing by higher DNA insert lengths

Project description: Background: Whole exome sequencing (WES) has proven to be a valuable basis for various applications such as variant calling and copy number variation (CNV) analyses. For those analyses, read coverage should be evenly balanced throughout protein-coding regions at sufficient read depth. Unfortunately, WES is known for its uneven coverage within coding regions due to GC-rich regions or off-target enrichment. Results: To examine the irregularities of WES within genes, we applied Agilent SureSelectXT exome capture to human samples and sequenced these via Illumina in 2x101 paired-end mode. As we suspected the sequenced insert length to be crucial to the uneven coverage of exome-captured samples, we sheared 12 genomic DNA samples to two different DNA insert lengths, namely 130 and 170 bp. Interestingly, although mean coverage of target regions was clearly higher in samples of 130 bp insert length, coverage was more even in the 170 bp samples. Moreover, merging overlapping paired-end reads had a positive effect on evenness, indicating overlapping reads as another cause of the unevenness. In addition, mutation analysis was performed on a subset of the samples. In these isogenic subclones, almost twice as many mutations were discarded in the 130 bp samples as in the 170 bp samples. Visual inspection of the discarded mutation sites revealed low coverage at these sites, embedded in regions with high-amplitude fluctuations in coverage depth. Conclusions: Producing longer insert reads could be a good strategy to achieve more uniform read coverage in coding regions, thereby enhancing the effective sequencing yield and providing an improved basis for further variant calling and CNV analyses.

2016-07-28 | E-MTAB-4527 | biostudies-arrayexpress

Breakpoint detection using long insert whole genome sequencing

Project description: In this study, we hypothesize that shallow long insert whole genome sequencing (LI-WGS) increases our power for detecting breakpoints compared to shallow short insert WGS. We performed a priori analyses to demonstrate the benefits of LI-WGS, developed a long insert library preparation protocol based on Illumina's protocol, and compared LI-WGS against short insert WGS on test samples. We then used long insert WGS to identify translocations and copy number changes in tumor and germline samples collected from cancer patients with different malignancies.

phs000646.v1.p1 | EGA

Evaluation of copy number variation detection between high-resolution array CGH and low-coverage short-insert and mate-pair whole-genome sequencing

Project description: In principle, whole-genome sequencing (WGS) of the human genome, even at low coverage, offers higher resolution for genomic copy number variation (CNV) detection than array-based technologies, which are currently the first-tier approach in clinical cytogenetics. There are, however, obstacles to replacing array-based CNV detection with low-coverage WGS, such as cost, turnaround time, and lack of systematic performance comparisons. With technological advances in WGS in terms of library preparation, instrument platforms, and data analysis algorithms, obstacles imposed by cost and turnaround time are fading. However, a systematic performance comparison between array and low-coverage WGS-based CNV detection has yet to be performed. Here, we compared the CNV detection capabilities of WGS (short-insert, 3kb-, and 5kb-mate-pair libraries) at 1X, 3X, and 5X coverages with those of commonly used high-resolution arrays in the 1000-Genomes-Project CEU genome NA12878. CNV detection was performed using standard analysis methods, and the results were then compared to a list of Gold Standard NA12878 CNVs distilled from the 1000-Genomes Project. Overall, low-coverage WGS is able to detect drastically more (approximately 5 fold more on average) Gold Standard CNVs compared to arrays and is accompanied by fewer CNV calls without secondary validation. Furthermore, we also show that WGS (at ≥1X coverage) is able to detect all seven validated deletions larger than 100 kb in the NA12878 genome, whereas only one such deletion is detected in most arrays. Finally, we show that the much larger 15 Mbp Cri-du-chat deletion can be clearly seen at even 1X coverage from short-insert WGS.

2018-06-26 | GSE105092 | GEO

Multiple insert size paired-end sequencing for deconvolution of complex transcriptomes

Project description: Deep sequencing of transcriptomes allows quantitative and qualitative analysis of many RNA species in a sample, with parallel comparison of expression levels, splicing variants, natural antisense transcripts, RNA editing and transcriptional start and stop sites the ideal goal. By computational modeling, we show how libraries of multiple insert sizes combined with strand-specific, paired-end (SS-PE) sequencing can increase the information gained on alternative splicing, especially in higher eukaryotes. Despite the benefits of gaining SS-PE data with paired ends of varying distance, the standard Illumina protocol allows only non-strand-specific, paired-end sequencing with a single insert size. Here, we modify the Illumina RNA ligation protocol to allow SS-PE sequencing by using a custom pre-adenylated 3’ adaptor. We generate parallel libraries with differing insert sizes to aid deconvolution of alternative splicing events and to characterize the extent and distribution of natural antisense transcription in C. elegans. Despite stringent requirements for detection of alternative splicing, our data increases the number of intron retention and exon skipping events annotated in the Wormbase genome annotations by 127% and 121%, respectively. We show that parallel libraries with a range of insert sizes increase transcriptomic information gained by sequencing and that by current established benchmarks our protocol gives competitive results with respect to library quality.

2012-08-31 | GSE40507 | GEO

High-resolution transcriptome analysis with long-read RNA sequencing

Project description: Ongoing improvements to next generation sequencing technologies are leading to longer sequencing read lengths, but a thorough understanding of the impact of longer reads on RNA sequencing analyses is lacking. To address this issue, we generated and compared two RNA sequencing datasets of differing read lengths, 2x75 bp (L75) and 2x262 bp (L262), and investigated the impact of read length on various aspects of analysis, including the performance of currently available read-mapping tools, gene and transcript quantification, and detection of allele-specific expression patterns. Our results indicate that, while the scalability of read-mapping tools and the cost-effectiveness of the long read protocol are issues that require further attention, longer reads enable more accurate quantification of diverse aspects of gene expression, including individual-specific patterns of allele-specific expression and alternative splicing.

2014-09-25 | GSE57862 | GEO

Single-molecule regulatory architectures captured by chromatin fiber sequencing [DNAseI-seq]

Project description: We report a method for precisely stenciling the structure of individual chromatin fibers onto their composite DNA templates using non-specific DNA N6-adenine methyltransferases. Single-molecule long-read sequencing using PacBio of these chromatin stencils enables nucleotide-resolution readout of the primary architecture of multi-kilobase chromatin fibers (Fiber-seq).

2020-06-25 | GSE146939 | GEO

Single-molecule regulatory architectures captured by chromatin fiber sequencing [Hs AdMTase-seq]

2020-06-25 | GSE146938 | GEO

Single-molecule regulatory architectures captured by chromatin fiber sequencing [Dm AdMTase-seq]

2020-06-25 | GSE146937 | GEO

Single-molecule regulatory architectures captured by chromatin fiber sequencing [Hs Fiber-seq]

2020-06-25 | GSE146941 | GEO