Characterization, correction and de novo assembly of an Oxford Nanopore genomic dataset from Agrobacterium tumefaciens.
ABSTRACT: The MinION is a portable single-molecule DNA sequencing instrument that was released by Oxford Nanopore Technologies in 2014, producing long sequencing reads by measuring changes in ionic flow when single-stranded DNA molecules translocate through the pores. While MinION long reads have an error rate substantially higher than the ones produced by short-read sequencing technologies, they can generate de novo assemblies of microbial genomes, after an initial correction step that includes alignment of Illumina sequencing data or detection of overlaps between Oxford Nanopore reads to improve accuracy. In this study, MinION reads were generated from the multi-chromosome genome of Agrobacterium tumefaciens strain LBA4404. Errors in the consensus two-directional (sense and antisense) "2D" sequences were first characterized by way of comparison with an internal reference assembly. Both Illumina-based correction and self-correction were performed and the resulting corrected reads assembled into high-quality hybrid and non-hybrid assemblies. Corrected read datasets and assemblies were subsequently compared. The results shown here indicate that both hybrid and non-hybrid methods can be used to assemble Oxford Nanopore reads into informative multi-chromosome assemblies, each with slightly different outcomes in terms of contiguity and accuracy.
Project description:BACKGROUND: Short-read sequencing technologies have made microbial genome sequencing cheap and accessible. However, closing genomes is often costly and assembling short reads from genomes that are repetitive and/or have extreme %GC content remains challenging. Long-read, single-molecule sequencing technologies such as the Oxford Nanopore MinION have the potential to overcome these difficulties, although the best approach for harnessing their potential remains poorly evaluated. RESULTS: We sequenced nine bacterial genomes spanning a wide range of GC contents using Illumina MiSeq and Oxford Nanopore MinION sequencing technologies to determine the advantages of each approach, both individually and combined. Assemblies using only MiSeq reads were highly accurate but lacked contiguity, a deficiency that was partially overcome by adding MinION reads to these assemblies. Even more contiguous genome assemblies were generated by using MinION reads for initial assembly, but these assemblies were more error-prone and required further polishing. This was especially pronounced when Illumina libraries were biased, as was the case for our strains with both high and low GC content. Increased genome contiguity dramatically improved the annotation of insertion sequences and secondary metabolite biosynthetic gene clusters, likely because long reads can disambiguate these highly repetitive but biologically important genomic regions. CONCLUSIONS: Genome assembly using short reads is challenged by repetitive sequences and extreme GC contents. Our results indicate that these difficulties can be largely overcome by using single-molecule, long-read sequencing technologies such as the Oxford Nanopore MinION. Using MinION reads for assembly followed by polishing with Illumina reads generated the most contiguous genomes with sufficient accuracy to enable the accurate annotation of important but difficult-to-sequence genomic features such as insertion sequences and secondary metabolite biosynthetic gene clusters. The combination of Oxford Nanopore and Illumina sequencing can therefore cost-effectively advance studies of microbial evolution and genome-driven drug discovery.
Project description:Long-read sequencing technologies were launched a few years ago, and in contrast with short-read sequencing technologies, they offered a promise of solving assembly problems for large and complex genomes. Moreover, by providing long-range information, they could also solve haplotype phasing. However, existing long-read technologies still have several limitations that complicate their use for most research laboratories, as well as in large and/or complex genome projects. In 2014, Oxford Nanopore released the MinION® device, a small and low-cost single-molecule nanopore sequencer, which offers the possibility of sequencing long DNA fragments. The assembly of long reads generated using the Oxford Nanopore MinION® instrument is challenging as existing assemblers were not implemented to deal with long reads exhibiting error rates close to 30%. Here, we presented a hybrid approach developed to take advantage of data generated using the MinION® device. We sequenced a well-known bacterium, Acinetobacter baylyi ADP1, and applied our method to obtain a highly contiguous (one single contig) and accurate genome assembly even in repetitive regions, in contrast to an Illumina-only assembly. Our hybrid strategy was able to generate NaS (Nanopore Synthetic-long) reads up to 60 kb that aligned entirely and with no error to the reference genome and that spanned highly conserved repetitive regions. The average accuracy of NaS reads reached 99.99% without losing the initial size of the input MinION® reads. We described the NaS tool, a hybrid approach allowing the sequencing of microbial genomes using the MinION® device. Our method, based ideally on 20x and 50x of NaS and Illumina reads respectively, provides an efficient and cost-effective way of sequencing microbial or small eukaryotic genomes in a very short time even in small facilities. Moreover, we demonstrated that although the Oxford Nanopore technology is a relatively new sequencing technology, currently with a high error rate, it is already useful in the generation of high-quality genome assemblies.
Project description:Long-read sequencing technologies such as Pacific Biosciences and Oxford Nanopore MinION are capable of producing long sequencing reads with average fragment lengths of over 10,000 base-pairs and maximum lengths reaching 100,000 base-pairs. Compared with short reads, the assemblies obtained from long-read sequencing platforms have much higher contig continuity and genome completeness as long fragments are able to extend paths into problematic or repetitive regions. Many successful assembly applications of the Pacific Biosciences technology have been reported ranging from small bacterial genomes to large plant and animal genomes. Recently, genome assemblies using Oxford Nanopore MinION data have attracted much attention due to the portability and low cost of this novel sequencing instrument. In this paper, we re-sequenced a well characterized genome, the Saccharomyces cerevisiae S288C strain using three different platforms: MinION, PacBio and MiSeq. We present a comprehensive metric comparison of assemblies generated by various pipelines and discuss how the platform associated data characteristics affect the assembly quality. With a given read depth of 31X, the assemblies from both Pacific Biosciences and Oxford Nanopore MinION show excellent continuity and completeness for the 16 nuclear chromosomes, but not for the mitochondrial genome, whose reconstruction still represents a significant challenge.
Project description:Monitoring the progress of DNA molecules through a membrane pore has been postulated as a method for sequencing DNA for several decades. Recently, a nanopore-based sequencing instrument, the Oxford Nanopore MinION, has become available, and we used this for sequencing the Saccharomyces cerevisiae genome. To make use of these data, we developed a novel open-source hybrid error correction algorithm, Nanocorr, specifically for Oxford Nanopore reads, because existing packages were incapable of assembling the long read lengths (5-50 kbp) at such high error rates (between ∼5% and 40% error). With this new method, we were able to perform a hybrid error correction of the nanopore reads using complementary MiSeq data and produce a de novo assembly that is highly contiguous and accurate: The contig N50 length is more than ten times greater than an Illumina-only assembly (678 kbp versus 59.9 kbp) and has >99.88% consensus identity when compared to the reference. Furthermore, the assembly with the long nanopore reads presents a much more complete representation of the features of the genome and correctly assembles gene cassettes, rRNAs, transposable elements, and other genomic features that were almost entirely absent in the Illumina-only assembly.
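The contig N50 quoted in the abstract above is a standard contiguity metric: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal Python sketch of that definition follows; the function name `n50` and the example contig lengths are illustrative assumptions and are not taken from any of the studies listed here.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    # Walk through contigs from longest to shortest, accumulating length
    # until at least half of the total assembly length is covered.
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical assembly: total length 300, so the halfway point is 150.
# The two longest contigs (100 + 80 = 180) reach it, so the N50 is 80.
example_lengths = [100, 80, 50, 30, 20, 10, 10]
print(n50(example_lengths))
```

Because N50 depends only on the length distribution, it says nothing about correctness, which is why the abstracts report it alongside consensus identity.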
Project description:Illumina sequencing platforms have enabled widespread bacterial whole genome sequencing. While Illumina data is appropriate for many analyses, its short read length limits its ability to resolve genomic structure. This has major implications for tracking the spread of mobile genetic elements, including those which carry antimicrobial resistance determinants. Fully resolving a bacterial genome requires long reads such as those generated by Oxford Nanopore Technologies (ONT) platforms. Here we describe our use of the ONT MinION to sequence 12 isolates of Klebsiella pneumoniae on a single flow cell. We assembled each genome using a combination of ONT reads and previously available Illumina reads, and little to no manual intervention was needed to achieve fully resolved assemblies using the Unicycler hybrid assembler. Assembling only ONT reads with Canu was less effective, resulting in fewer resolved genomes and higher error rates even following error correction with Nanopolish. We demonstrate that multiplexed ONT sequencing is a valuable tool for high-throughput bacterial genome finishing. Specifically, we advocate the use of Illumina sequencing as a first analysis step, followed by ONT reads as needed to resolve genomic structure.
Project description:BACKGROUND: Oxford Nanopore Technologies Ltd (Oxford, UK) have recently commercialized MinION, a small single-molecule nanopore sequencer, that offers the possibility of sequencing long DNA fragments from small genomes in a matter of seconds. The Oxford Nanopore technology is truly disruptive; it has the potential to revolutionize genomic applications due to its portability, low cost, and ease of use compared with existing long-read sequencing technologies. The MinION sequencer enables the rapid sequencing of small eukaryotic genomes, such as the yeast genome. Combined with existing assembler algorithms, near complete genome assemblies can be generated and comprehensive population genomic analyses can be performed. RESULTS: Here, we resequenced the genome of the Saccharomyces cerevisiae S288C strain to evaluate the performance of nanopore-only assemblers. Then we de novo sequenced and assembled the genomes of 21 isolates representative of the S. cerevisiae genetic diversity using the MinION platform. The contiguity of our assemblies was 14 times higher than the Illumina-only assemblies and we obtained one or two long contigs for 65% of the chromosomes. This high contiguity allowed us to accurately detect large structural variations across the 21 studied genomes. CONCLUSION: Because of the high completeness of the nanopore assemblies, we were able to produce a complete cartography of transposable element insertions and inspect structural variants that are generally missed using a short-read sequencing strategy. Our analyses show that the Oxford Nanopore technology is already usable for de novo sequencing and assembly; however, non-random errors in homopolymers require polishing the consensus using an alternate sequencing technology.
Project description:The sequencing, assembly, and analysis of bacterial genomes is central to tracking and characterizing foodborne pathogens. The bulk of bacterial genome sequencing at the US Food and Drug Administration is performed using short-read Illumina MiSeq technology, resulting in highly accurate but fragmented genomic sequences. The MinION sequencer from Oxford Nanopore is an evolving technology that produces long-read sequencing data with low equipment cost. The goal of this study was to compare Campylobacter genome assemblies generated from MiSeq and MinION data independently, as well as hybrid genome assemblies combining both data types. Two reference strains and two field isolates of C. jejuni were sequenced using MiSeq and MinION, and the sequence data were assembled using the software programs SPAdes and Canu, respectively. Hybrid genome assembly was performed using the program Unicycler. Comparison of the C. jejuni 81-176 and RM1221 genome assemblies to the PacBio reference genomes revealed that the SPAdes assemblies had the most accurate nucleotide identity, while the hybrid assemblies were the most contiguous. Assemblies generated only from MinION data using Canu were the least accurate, containing many indels and substitutions that affected downstream analyses. The hybrid sequencing approach was the most useful for detecting plasmids, large genome rearrangements, and repetitive elements such as rRNA and tRNA genes. The full genomes of both C. jejuni field isolates were completed and circularized using hybrid sequencing, and a plasmid was detected in one isolate. Continued development of nanopore sequencing technologies will likely enhance the accuracy of hybrid genome assemblies and enable public health laboratories to routinely generate complete circularized bacterial genome sequences.
Project description:The Oxford Nanopore Technologies (ONT) MinION is a new sequencing technology that potentially offers read lengths of tens of kilobases (kb) limited only by the length of DNA molecules presented to it. The device has a low capital cost, is by far the most portable DNA sequencer available, and can produce data in real-time. It has numerous prospective applications including improving genome sequence assemblies and resolution of repeat-rich regions. Before such a technology is widely adopted, it is important to assess its performance and limitations in respect of throughput and accuracy. In this study we assessed the performance of the MinION by re-sequencing three bacterial genomes with very different nucleotide compositions, with G + C content ranging from 28.6% to 70.7%; the high G + C strain was underrepresented in the sequencing reads. We estimate the error rate of the MinION (after base calling) to be 38.2%. Mean and median read lengths were 2 kb and 1 kb respectively, while the longest single read was 98 kb. The whole length of a 5 kb rRNA operon was covered by a single read. As the first nanopore-based single molecule sequencer available to researchers, the MinION is an exciting prospect; however, the current error rate limits its ability to compete with existing sequencing technologies, though we do show that MinION sequence reads can enhance contiguity of de novo assembly when used in conjunction with Illumina MiSeq data.
Project description:Background: The introduction of the MinION sequencing device by Oxford Nanopore Technologies may greatly accelerate whole genome sequencing. Nanopore sequence data offers great potential for de novo assembly of complex genomes without using other technologies. Furthermore, Nanopore data combined with other sequencing technologies is highly useful for accurate annotation of all genes in the genome. In this manuscript we used nanopore sequencing as a tool to classify yeast strains. Methods: We compared various technical and software developments for the nanopore sequencing protocol, showing that the R9 chemistry is, as predicted, higher in quality than R7.3 chemistry. The R9 chemistry is an essential improvement for assembly of the extremely AT-rich mitochondrial genome. We double corrected assemblies from four different assemblers with PILON and assessed sequence correctness before and after PILON correction with a set of 290 Fungi genes using BUSCO. Results: In this study, we used this new technology to sequence and de novo assemble the genome of a recently isolated ethanologenic yeast strain, and compared the results with those obtained by classical Illumina short read sequencing. This strain was originally named Candida vartiovaarae (Torulopsis vartiovaarae) based on ribosomal RNA sequencing. We show that the assembly using nanopore data is much more contiguous than the assembly using short read data. We also compared various technical and software developments for the nanopore sequencing protocol, showing that nanopore-derived assemblies provide the highest contiguity. Conclusions: The mitochondrial and chromosomal genome sequences showed that our strain is clearly distinct from other yeast taxa and most closely related to published Cyberlindnera species. In conclusion, MinION-mediated long read sequencing can be used for high quality de novo assembly of new eukaryotic microbial genomes.
Project description:Background: The Oxford Nanopore Technologies MinION(TM) is a mobile DNA sequencer that can produce long read sequences with a short turn-around time. Here we report the first demonstration of single contig genome assembly using Oxford Nanopore native barcoding when applied to a multiplexed library of 12 samples and combined with existing Illumina short read data. This paves the way for the closure of multiple bacterial genomes from a single MinION(TM) sequencing run, given the availability of existing short read data. The strain we used, MHO_001, represents the important community-acquired methicillin-resistant Staphylococcus aureus lineage USA300. Findings: Using a hybrid assembly of existing short read and barcoded long read sequences from multiplexed data, we completed a genome of the S. aureus USA300 strain MHO_001. The long read data represented only ~5% to 10% of an average MinION(TM) run (~7x genomic coverage), but, using standard tools, this was sufficient to complete the circular chromosome of S. aureus strain MHO_001 (2.86 Mb) and two complete plasmids (27 Kb and 3 Kb). Minor differences were noted when compared to USA300 reference genome, USA300_FPR3757, including the translocation, loss, and gain of mobile genetic elements. Conclusion: Here we demonstrate that MinION(TM) reads, multiplexed using native barcoding, can be used in combination with short read data to fully complete a bacterial genome. The ability to complete multiple genomes, for which short read data is already available, from a single MinION(TM) run is set to impact our understanding of accessory genome content, plasmid diversity, and genome rearrangements.
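Genomic coverage figures such as the one in the abstract above are conventionally computed as the total number of sequenced bases divided by the genome size. The Python sketch below illustrates that arithmetic only; the function name `mean_coverage` and the read set are hypothetical, and only the 2.86 Mb chromosome size comes from the abstract.

```python
def mean_coverage(read_lengths, genome_size):
    """Return mean sequencing depth: total read bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


# Hypothetical read set: 2,500 reads of 8,000 bp each (20 Mb in total),
# compared against the 2.86 Mb chromosome size quoted in the abstract.
hypothetical_reads = [8_000] * 2_500
depth = mean_coverage(hypothetical_reads, 2_860_000)
print(f"{depth:.1f}x")  # prints "7.0x"
```

This is an average over the whole genome; real depth varies along the sequence, so a low mean figure can still leave some regions uncovered.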