Project description:DNA methylation plays a fundamental role in the control of gene expression and genome integrity. Although there are multiple tools that enable its detection from Nanopore sequencing, their accuracy remains largely unknown. Here, we present a systematic benchmarking of tools for the detection of CpG methylation from Nanopore sequencing using individual reads, control mixtures of methylated and unmethylated reads, and bisulfite sequencing. We found that the tools trade off false positives against false negatives and show high dispersion around the expected methylation frequency values. We describe various strategies to improve the accuracy of these tools, including a consensus approach, METEORE (https://github.com/comprna/METEORE), which combines the predictions from two or more tools and shows improved accuracy over individual tools. Snakemake pipelines are also provided for reproducibility and to enable the systematic application of our analyses to other datasets.
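The idea of combining predictions from two or more tools can be illustrated with a minimal sketch. It assumes each tool reports a per-site methylation probability and simply averages those probabilities. This is a simplification for illustration, not the combination model METEORE itself uses; the function name, tool names, site keys, and threshold are all hypothetical.

```python
from statistics import mean


def consensus_calls(tool_scores, threshold=0.5):
    """Average per-site methylation probabilities across tools.

    tool_scores: dict mapping tool name -> {site: probability in [0, 1]}.
    Only sites scored by at least two tools receive a consensus call.
    Returns {site: (mean probability, called methylated?)}.
    """
    per_site = {}
    for scores in tool_scores.values():
        for site, prob in scores.items():
            per_site.setdefault(site, []).append(prob)

    calls = {}
    for site, probs in per_site.items():
        if len(probs) >= 2:  # need agreement material from two or more tools
            avg = mean(probs)
            calls[site] = (avg, avg >= threshold)
    return calls


# Made-up scores from two hypothetical tools.
example = {
    "tool_a": {("chr1", 100): 0.75, ("chr1", 200): 0.25},
    "tool_b": {("chr1", 100): 0.5, ("chr1", 200): 0.5, ("chr1", 300): 0.75},
}
result = consensus_calls(example)
```

Averaging is the simplest possible combiner; a trained model over the per-tool scores could weight tools differently, at the cost of needing labeled control data.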
Project description:Epigenetic characterization of cell-free DNA (cfDNA) is an emerging approach for detecting and characterizing diseases such as cancer. We developed a strategy using nanopore-based single-molecule sequencing to measure cfDNA methylomes. This approach generated up to 200 million reads for a single cfDNA sample from cancer patients, an order of magnitude improvement over existing nanopore sequencing methods. We developed a single-molecule classifier to determine whether individual reads originated from a tumor or immune cells. Leveraging methylomes of matched tumors and immune cells, we characterized cfDNA methylomes of cancer patients for longitudinal monitoring during treatment.
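As an illustration of how a single read could be assigned to a tumor or immune origin from reference methylomes, the following sketch sums a per-CpG log-likelihood ratio along the read. It is an assumed, naive-Bayes-style scheme, not the classifier described above; the function name, the reference methylation rates, and the clipping constant `eps` are hypothetical.

```python
import math


def classify_read(read_calls, tumor_rates, immune_rates, eps=0.01):
    """Classify one read by a summed log-likelihood ratio over its CpGs.

    read_calls: {site: True if methylated on this read, else False}.
    tumor_rates / immune_rates: {site: reference methylation rate in [0, 1]}.
    Rates are clipped to [eps, 1 - eps] so that log() never sees 0.
    """
    llr = 0.0
    for site, methylated in read_calls.items():
        if site not in tumor_rates or site not in immune_rates:
            continue  # skip CpGs missing from either reference
        p_tumor = min(max(tumor_rates[site], eps), 1 - eps)
        p_immune = min(max(immune_rates[site], eps), 1 - eps)
        if methylated:
            llr += math.log(p_tumor / p_immune)
        else:
            llr += math.log((1 - p_tumor) / (1 - p_immune))

    if llr > 0:
        label = "tumor"
    elif llr < 0:
        label = "immune"
    else:
        label = "undetermined"
    return label, llr


# Made-up reference rates at three CpG sites.
tumor_ref = {1: 0.9, 2: 0.8, 3: 0.1}
immune_ref = {1: 0.1, 2: 0.2, 3: 0.9}
label_a, _ = classify_read({1: True, 2: True, 3: False}, tumor_ref, immune_ref)
label_b, _ = classify_read({1: False, 2: False, 3: True}, tumor_ref, immune_ref)
```

The sketch treats CpGs as independent, which real reads generally are not, so the resulting score is best read as a ranking rather than a calibrated probability.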
Project description:An optimized, well-tested and validated high-throughput assay based on targeted genomic sequencing is not currently available for routine biodefense and biosurveillance applications. Earlier, we addressed this gap by developing a multiplex end-point Polymerase Chain Reaction (PCR) assay followed by Oxford Nanopore Technology (ONT) based amplicon sequencing and customized data processing, and by establishing baseline comparisons of this assay to real-time PCR. Here, we expand upon this effort by identifying the optimal ONT library preparation method for integration into a novel software platform, ONT-DART (ONT-Detection of Amplicons in Real-Time). ONT-DART is a dockerized, real-time, amplicon-sequence analysis workflow that is used to reproducibly process and filter read data to support actionable amplicon detection calls based on alignment metrics, within-sample statistics, and no-template control data. This analysis pipeline was used to compare four ONT library preparation protocols using R9 and Flongle (FL) flow cells. The two 4-Primer methods tested required the shortest preparation times (5.5 and 6.5 h) for 48 libraries but provided lower fidelity data. The Native Barcoding and Ligation methods required longer preparation times of 8 and 12 h, respectively, and resulted in higher overall data quality. On average, data derived from R9 flow cells produced true positive calls for target organisms more than twice as fast as the lower throughput FL flow cells. These results suggest that utilizing the R9 flow cell with an ONT Native Barcoding amplicon library method in combination with ONT-DART platform analytics provides the best sequencing-based alternative to current PCR-based biodetection methods.
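A detection call that combines read support, within-sample statistics, and no-template control (NTC) data might look like the sketch below. The three criteria and every threshold are assumptions chosen for illustration; they are not ONT-DART's actual rules, and the function and target names are hypothetical.

```python
def call_targets(sample_counts, ntc_counts,
                 min_reads=10, min_fraction=0.01, ntc_fold=5.0):
    """Return {target: True/False} detection calls for one sample.

    A target is called positive only if all three hypothetical criteria hold:
      1. at least `min_reads` aligned reads,
      2. at least `min_fraction` of all aligned reads in the sample,
      3. more than `ntc_fold` times the reads seen in the NTC.
    """
    total = sum(sample_counts.values())
    calls = {}
    for target, n_reads in sample_counts.items():
        ntc_reads = ntc_counts.get(target, 0)
        fraction = n_reads / total if total else 0.0
        calls[target] = (
            n_reads >= min_reads
            and fraction >= min_fraction
            and n_reads > ntc_fold * ntc_reads
        )
    return calls


# Made-up aligned-read counts per amplicon target.
sample = {"target_A": 500, "target_B": 12, "target_C": 3}
ntc = {"target_B": 4}
calls = call_targets(sample, ntc)
```

Here `target_B` passes the read-count and fraction checks but is rejected because the same amplicon also appears in the control.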
Project description:Modifications are present on many classes of RNA, including tRNA, rRNA, and mRNA. These modifications modulate diverse biological processes such as genetic recoding and mRNA export and folding. In addition, modifications can be introduced to RNA molecules using chemical probing strategies that reveal RNA structure and dynamics. Many methods exist to detect RNA modifications by short-read sequencing; however, limitations on read length inherent to short-read-based methods dissociate modifications from their native context, preventing single-molecule modification analysis. Here, we demonstrate direct RNA nanopore sequencing to detect endogenous and exogenous RNA modifications on long RNAs at the single-molecule level. We detect endogenous 2'-O-methyl and base modifications across E. coli and S. cerevisiae ribosomal RNAs as shifts in current signal and dwell times distally through interactions with the helicase motor protein. We further use the 2'-hydroxyl reactive SHAPE reagent acetylimidazole to probe RNA structure at the single-molecule level with readout by direct nanopore sequencing.
Project description:Motivation: The Oxford Nanopore technology has great potential for the analysis of methylated motifs in genomes, including whole-genome methylome profiling. However, we found that no existing methylation motif detection algorithm is both sufficiently sensitive and deterministic. For example, the MEME suite does not extract all Helicobacter pylori methylation sites de novo, even using the iterative approach implemented in the most up-to-date methylation analysis tool, Nanodisco. Results: We present Snapper, a new highly sensitive approach to extract methylation motif sequences based on a greedy motif selection algorithm. Snapper does not require manual control during the enrichment process and has higher enrichment sensitivity than MEME coupled with Tombo or Nanodisco, as demonstrated on H. pylori strain J99, studied earlier by the PacBio technology, and on four external datasets representing different bacterial species. We used Snapper to characterize the total methylome of a new H. pylori strain, A45. At least four methylation sites that have not been described for H. pylori earlier were revealed. We experimentally confirmed the presence of a new CCAG-specific methyltransferase and inferred a gene encoding a new CCAAK-specific methyltransferase. Availability and implementation: Snapper is implemented in Python and is freely available as a pip package named "snapper-ont." Snapper and the demo dataset are also available in Zenodo (10.5281/zenodo.10117651).
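To make the notion of greedy motif selection concrete, here is a minimal sketch: repeatedly pick the k-mer found in the most remaining sequence contexts, record it, and discard the contexts it explains. This is a generic greedy cover written for illustration, not Snapper's implementation; the function name, parameters, and toy sequences are hypothetical. Ties are broken alphabetically so that the output is deterministic.

```python
from collections import Counter


def greedy_motifs(sequences, k=4, min_support=2):
    """Greedily select k-mers that explain the most remaining sequences.

    Returns a list of (k-mer, number of sequences it covered) in pick order.
    Stops when the best k-mer covers fewer than `min_support` sequences.
    """
    remaining = list(sequences)
    motifs = []
    while remaining:
        counts = Counter()
        for seq in remaining:
            # Use a set so each k-mer is counted once per sequence.
            counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
        if not counts:
            break
        # Highest support first; alphabetical order breaks ties.
        best, support = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        if support < min_support:
            break
        motifs.append((best, support))
        remaining = [seq for seq in remaining if best not in seq]
    return motifs


# Made-up sequence contexts around putatively modified positions.
contexts = ["TTCCAGTT", "AACCAGGA", "GCCAGTAC",
            "TGATCAAT", "CGATCTTA", "AAAAAAAA"]
found = greedy_motifs(contexts)
```

A practical tool would score candidates against a background of unmodified contexts and handle degenerate bases such as K; the sketch omits both.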
Project description:We describe a novel single molecule nanopore-based sequencing by synthesis (Nano-SBS) strategy that can accurately distinguish four bases by detecting 4 different sized tags released from 5'-phosphate-modified nucleotides. The basic principle is as follows. As each nucleotide is incorporated into the growing DNA strand during the polymerase reaction, its tag is released and enters a nanopore in release order. This produces a unique ionic current blockade signature due to the tag's distinct chemical structure, thereby determining DNA sequence electronically at single molecule level with single base resolution. As proof of principle, we attached four different length PEG-coumarin tags to the terminal phosphate of 2'-deoxyguanosine-5'-tetraphosphate. We demonstrate efficient, accurate incorporation of the nucleotide analogs during the polymerase reaction, and excellent discrimination among the four tags based on nanopore ionic currents. This approach coupled with polymerase attached to the nanopores in an array format should yield a single-molecule electronic Nano-SBS platform.
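The decoding step, in which each released tag produces a characteristic current blockade that identifies the incorporated base, can be sketched as a nearest-level lookup. The reference levels, the tolerance, and the assignment of one tag per base are invented for illustration and do not come from the measurements described above.

```python
# Hypothetical normalized blockade levels, one per tag/base.
TAG_LEVELS = {"A": 0.20, "C": 0.35, "G": 0.50, "T": 0.65}


def decode_events(events, levels=TAG_LEVELS, tolerance=0.05):
    """Map each blockade event, in release order, to the nearest tag level.

    Events farther than `tolerance` from every reference level become 'N'.
    """
    bases = []
    for level in events:
        base, ref = min(levels.items(), key=lambda kv: abs(kv[1] - level))
        bases.append(base if abs(ref - level) <= tolerance else "N")
    return "".join(bases)


# Made-up events; the last one matches no tag and is reported as N.
sequence = decode_events([0.21, 0.64, 0.36, 0.49, 0.90])
```

Because tags enter the pore in release order, the order of events directly gives the order of incorporated bases; a real decoder would also need to handle missed or duplicated tag captures.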
Project description:To detect the modified bases in SINEUP RNA, we compared chemically modified in vitro transcribed (IVT) SINEUP-GFP RNA and in-cell transcribed (ICT) SINEUP RNA from HEK293T/17 cells co-transfected with SINEUP-GFP and sense EGFP. A comparative study of Nanopore direct RNA sequencing data from non-modified and modified IVT samples against data from the ICT SINEUP RNA sample revealed the positions of modified k-mers in SINEUP RNA in the cell.
Project description:Facioscapulohumeral muscular dystrophy (FSHD) is caused by a unique genetic mechanism that relies on contraction and hypomethylation of the D4Z4 macrosatellite array on the chromosome 4q telomere, allowing ectopic expression of the DUX4 gene in skeletal muscle. Genetic analysis is difficult because of the large size and repetitive nature of the array, a nearly identical array on the 10q telomere, and the presence of divergent D4Z4 arrays scattered throughout the genome. Here, we combine nanopore long-read sequencing with Cas9-targeted enrichment of 4q and 10q D4Z4 arrays for comprehensive genetic analysis including determination of the length of the 4q and 10q D4Z4 arrays with base-pair resolution. In the same assay, we differentiate 4q from 10q telomeric sequences, determine A/B haplotype, identify paralogous D4Z4 sequences elsewhere in the genome, and estimate methylation for all CpGs in the array. Asymmetric, length-dependent methylation gradients were observed in the 4q and 10q D4Z4 arrays that reach a hypermethylation point at approximately 10 D4Z4 repeat units, consistent with the known threshold of pathogenic D4Z4 contractions. High-resolution analysis of individual D4Z4 repeat methylation revealed areas of low methylation near the CTCF/insulator region and areas of high methylation immediately preceding the DUX4 transcriptional start site. Within the DUX4 exons, we observed a waxing/waning methylation pattern with a 180-nucleotide periodicity, consistent with phased nucleosomes. Targeted nanopore sequencing complements recently developed molecular combing and optical mapping approaches to genetic analysis for FSHD by adding precision of the length measurement, base-pair resolution sequencing, and quantitative methylation analysis.