
Dataset Information


Analysis of nanopore data using hidden Markov models.


ABSTRACT:

Motivation

Nanopore-based sequencing techniques can reconstruct properties of biosequences by analyzing the sequence-dependent ionic current steps produced as biomolecules pass through a pore. Typically this involves alignment of new data to a reference, where both reference construction and alignment have been performed by hand.

Results

We propose an automated method for aligning nanopore data to a reference through the use of hidden Markov models. Several features that arise from prior processing steps and from the class of enzyme used can be simply incorporated into the model. Previously, the M2MspA nanopore was shown to be sensitive enough to distinguish between cytosine, methylcytosine and hydroxymethylcytosine. We validated our automated methodology on a subset of that data by automatically calculating an error rate for the distinction between the three cytosine variants and show that the automated methodology produces a 2-3% error rate, lower than the 10% error rate from previous manual segmentation and alignment.
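
To make the idea of HMM-based alignment concrete, here is a minimal sketch in Python. It is not the authors' model (their scripts are in the repository listed under Availability and implementation). It assumes each reference position is one HMM state with a Gaussian emission over the mean current of a segment, and it assumes three transition types: staying on a position (for example, when segmentation splits one step in two), advancing by one, and skipping one position. The names `viterbi_align`, `p_stay` and `p_skip`, and all numeric values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_logpdf(x, mean, std):
    """Log density of a normal distribution, elementwise."""
    return -0.5 * np.log(2 * np.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def viterbi_align(segment_means, ref_means, ref_stds, p_stay=0.2, p_skip=0.1):
    """Return the most likely reference index for each observed segment.

    Assumed model (illustrative only): a left-to-right HMM that starts at
    the first reference position, ends at the last one, and may stay,
    advance by one, or skip one position at each step.
    """
    obs = np.asarray(segment_means, dtype=float)
    mu = np.asarray(ref_means, dtype=float)
    sd = np.asarray(ref_stds, dtype=float)
    n_obs, n_ref = len(obs), len(mu)

    log_stay = np.log(p_stay)
    log_skip = np.log(p_skip)
    log_step = np.log(1.0 - p_stay - p_skip)

    # emit[t, j] = log P(segment t | reference position j)
    emit = gaussian_logpdf(obs[:, None], mu[None, :], sd[None, :])

    score = np.full((n_obs, n_ref), -np.inf)
    back = np.zeros((n_obs, n_ref), dtype=int)
    score[0, 0] = emit[0, 0]  # the path must start at position 0

    for t in range(1, n_obs):
        for j in range(n_ref):
            candidates = [(score[t - 1, j] + log_stay, j)]
            if j >= 1:
                candidates.append((score[t - 1, j - 1] + log_step, j - 1))
            if j >= 2:
                candidates.append((score[t - 1, j - 2] + log_skip, j - 2))
            best, prev = max(candidates)
            score[t, j] = best + emit[t, j]
            back[t, j] = prev

    if not np.isfinite(score[-1, -1]):
        raise ValueError("too few segments to reach the end of the reference")

    # Trace back from the last reference position.
    path = [n_ref - 1]
    for t in range(n_obs - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

Under these assumptions, calling `viterbi_align([50.5, 49.0, 30.2, 40.1], [50, 30, 70, 40], [2, 2, 2, 2])` assigns the first two segments to the first reference position, the third to the second, and the last to the fourth, so one position is repeated and one is skipped. A fuller model would also learn the emission and transition parameters from data rather than fixing them by hand.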

Availability and implementation

The data, output, scripts and tutorials replicating the analysis are available at https://github.com/UCSCNanopore/Data/tree/master/Automation.

SUBMITTER: Schreiber J 

PROVIDER: S-EPMC4553831 | biostudies-literature | 2015 Jun

REPOSITORIES: biostudies-literature


Publications

Analysis of nanopore data using hidden Markov models.

Jacob Schreiber, Kevin Karplus

Bioinformatics (Oxford, England), 2015-02-03, issue 12



Similar Datasets

| S-EPMC2883304 | biostudies-literature
| S-EPMC8097282 | biostudies-literature
| S-EPMC2770071 | biostudies-literature
| S-EPMC5097710 | biostudies-literature
| S-EPMC8902814 | biostudies-literature
| S-EPMC11324238 | biostudies-literature
| S-EPMC4251969 | biostudies-literature
| S-EPMC3676395 | biostudies-literature
| S-EPMC5832732 | biostudies-literature
| S-EPMC3356369 | biostudies-literature