
Dataset Information

Optimal assembly for high throughput shotgun sequencing.


ABSTRACT: We present a framework for the design of optimal assembly algorithms for shotgun sequencing under the criterion of complete reconstruction. We derive a lower bound on the read length and the coverage depth required for reconstruction in terms of the repeat statistics of the genome. Building on earlier work, we design a de Bruijn graph-based assembly algorithm that comes very close to achieving the lower bound for the repeat statistics of a wide range of sequenced genomes, including those in the GAGE datasets. The results are based on a set of necessary and sufficient conditions on the DNA sequence and the reads for reconstruction. These conditions can be viewed as the shotgun sequencing analogue of Ukkonen-Pevzner's necessary and sufficient conditions for Sequencing by Hybridization.

SUBMITTER: Bresler G 

PROVIDER: S-EPMC3706340 | biostudies-literature | 2013

REPOSITORIES: biostudies-literature

Publications

Optimal assembly for high throughput shotgun sequencing.

Guy Bresler, Ma'ayan Bresler, David Tse

BMC Bioinformatics, 2013-07-09


Similar Datasets

| S-EPMC3938818 | biostudies-literature
| S-EPMC5805770 | biostudies-literature
| S-EPMC3494147 | biostudies-literature
| S-EPMC6034344 | biostudies-literature
| S-EPMC4178013 | biostudies-literature
| S-EPMC7671303 | biostudies-literature
| S-EPMC4695645 | biostudies-literature
| S-EPMC3119603 | biostudies-literature