Project description:Inverted duplications are a common type of copy number variation (CNV) in germline and somatic genomes. Large duplications that include many genes can lead to both neurodevelopmental phenotypes in children and gene amplifications in tumors. There are several models for inverted duplication formation, most of which include a dicentric chromosome intermediate followed by breakage-fusion-bridge (BFB) cycles, but the mechanisms that give rise to the inverted dicentric chromosome in most inverted duplications remain unknown. Here we have combined high-resolution array CGH, custom sequence capture, next-generation sequencing, and long-range PCR to analyze the breakpoints of 50 nonrecurrent inverted duplications in patients with intellectual disability, autism, and congenital anomalies. Sequence analysis of breakpoint junctions reveals a normal-copy disomic spacer between inverted and non-inverted copies of the duplication. Further, short inverted repeats are present at the boundary of the disomic spacer and the inverted duplication. These data support a mechanism of inverted duplication formation whereby a chromosome with a double-strand break intrastrand pairs with itself to form a “hairpin” intermediate that, after DNA replication, produces a dicentric inverted chromosome with a disomic spacer corresponding to the site of the hairpin. We also find evidence of short insertions and inversions at inverted duplication junctions, consistent with a DNA replication-based CNV mechanism. This process can give rise to inverted duplications adjacent to terminal deletions, inverted duplications juxtaposed to translocations, and inverted duplication ring chromosomes. High-resolution array CGH; two-color experiment, clinical patient vs. normal control gDNA; sex-mismatched.
Project description:Inverted duplications are a common type of copy number variation (CNV) in germline and somatic genomes. Large duplications that include many genes can lead to both neurodevelopmental phenotypes in children and gene amplifications in tumors. There are several models for inverted duplication formation, most of which include a dicentric chromosome intermediate followed by breakage-fusion-bridge (BFB) cycles, but the mechanisms that give rise to the inverted dicentric chromosome in most inverted duplications remain unknown. Here we have combined high-resolution array CGH, custom sequence capture, next-generation sequencing, and long-range PCR to analyze the breakpoints of 50 nonrecurrent inverted duplications in patients with intellectual disability, autism, and congenital anomalies. Sequence analysis of breakpoint junctions reveals a normal-copy disomic spacer between inverted and non-inverted copies of the duplication. Further, short inverted repeats are present at the boundary of the disomic spacer and the inverted duplication. These data support a mechanism of inverted duplication formation whereby a chromosome with a double-strand break intrastrand pairs with itself to form a “hairpin” intermediate that, after DNA replication, produces a dicentric inverted chromosome with a disomic spacer corresponding to the site of the hairpin. We also find evidence of short insertions and inversions at inverted duplication junctions, consistent with a DNA replication-based CNV mechanism. This process can give rise to inverted duplications adjacent to terminal deletions, inverted duplications juxtaposed to translocations, and inverted duplication ring chromosomes.
Project description:Here we describe CapTrap-Seq, an experimental workflow designed to address the problem of reduced transcript end detection by long-read RNA sequencing methods, especially at the 5' ends. We apply CapTrap-Seq to profile transcriptomes of the human heart and brain and compare the results with other library preparation approaches. CapTrap-Seq is a platform-agnostic method, and here we tested it on three long-read sequencing platforms: MinION (ONT), Sequel (PacBio), and Sequel II (PacBio).
Project description:Interpreting the genomic and phenotypic consequences of copy number variation (CNV) is essential to understand the etiology of genetic disorders. Whereas deletion CNVs obviously lead to haploinsufficiency, duplications may cause disease through triplosensitivity, gene disruption, or gene fusion at breakpoints. The mutational spectrum of duplications has been studied at certain loci and in some cases these copy number gains are complex chromosome rearrangements involving triplications and/or inversions. However, the organization of clinically relevant duplications throughout the genome has not been investigated on a large scale. Here, we fine mapped 184 germline duplications (14.7 kb-25.3 Mb; median 532 kb) ascertained from individuals referred for diagnostic cytogenetics testing. We performed next-generation sequencing (NGS) and whole-genome sequencing (WGS) to sequence 130 breakpoints from 112 subjects with 119 CNVs and found that most (83%) were tandem duplications in direct orientation. The remainder were triplications embedded within duplications (8.4%), adjacent duplications (4.2%), insertional translocations (2.5%), or other complex rearrangements (1.7%). In addition, we predicted six in-frame fusion genes at sequenced duplication breakpoints. Four gene fusions were formed by tandem duplications, one by two interconnected duplications, and one by duplication inserted at another locus. These novel fusion genes could be related to clinical phenotypes and warrant further study. Though most duplications are positioned head-to-tail adjacent to the original locus, those that are inverted, triplicated, or inserted can disrupt or fuse genes in a manner that may not be predicted by conventional copy number analysis. Thus, interpreting the genetic consequences of duplication CNVs requires breakpoint-level analysis.
Project description:The LRGASP challenge encompasses different human, mouse, and manatee samples sequenced using multiple combinations of protocols and platforms. Different challenges will use distinct subsets of the samples for evaluation. The long-read sequencing platforms used in these challenges are the Pacific Biosciences (PacBio) Sequel II, Oxford Nanopore (ONT) MinION and PromethION. Samples will also be sequenced on the Illumina HiSeq 2500. The primary LRGASP library prep protocols are “standard” cDNA sequencing, direct RNA sequencing, R2C2, and CapTrap. Each sample will also include Lexogen SIRV-Set 4 spike-ins. We will also provide simulated PacBio and ONT data as part of the evaluations. This particular study focuses on single strand CAGE sequencing of human iPSCs, defining CAGE peaks from Illumina HiSeq 2500 (SR: 150 cycles) of two biological replicates for use in the LRGASP challenge.
Project description:Sequencing was performed to assess the ability of Nanopore direct cDNA and native RNA sequencing to characterise human transcriptomes. Total RNA was extracted from either HAP1 or HEK293 cells, and the polyA+ fraction was isolated using oligo(dT) Dynabeads. Libraries were prepared using Oxford Nanopore Technologies (ONT) kits according to the manufacturer's instructions. Samples were then sequenced on ONT R9.4 flow cells to generate FAST5 raw reads in the ONT MinKNOW software. FAST5 reads were then base-called using the ONT Albacore software to generate FASTQ reads.