Dataset Information


MaGuS: a tool for quality assessment and scaffolding of genome assemblies with Whole Genome Profiling™ Data.

ABSTRACT:

BACKGROUND: Scaffolding is an essential step in the genome assembly process. Current methods based on large fragment paired-end reads or long reads allow an increase in contiguity but often lack consistency in repetitive regions, resulting in fragmented assemblies. Here, we describe a novel tool to link assemblies to a genome map to aid complex genome reconstruction by detecting assembly errors and allowing scaffold ordering and anchoring.

RESULTS: We present MaGuS (map-guided scaffolding), a modular tool that uses a draft genome assembly, a Whole Genome Profiling™ (WGP) map, and high-throughput paired-end sequencing data to estimate the quality and to enhance the contiguity of an assembly. We generated several assemblies of the Arabidopsis genome using different scaffolding programs and applied MaGuS to select the best assembly using quality metrics. Then, we used MaGuS to perform map-guided scaffolding to increase contiguity by creating new scaffold links in low-covered and highly repetitive regions where other commonly used scaffolding methods lack consistency.

CONCLUSIONS: MaGuS is a powerful reference-free evaluator of assembly quality and a WGP map-guided scaffolder that is freely available at https://github.com/institut-de-genomique/MaGuS. Its use can be extended to other high-throughput sequencing data (e.g., long-read data) and also to other map data (e.g., genetic maps) to improve the quality and the contiguity of large and complex genome assemblies.


PROVIDER: S-EPMC4776351 | BioStudies | 2016-01-01

REPOSITORIES: biostudies

Similar Datasets

2012-01-01 | S-EPMC3311077 | BioStudies
2017-01-01 | S-EPMC5500127 | BioStudies
2017-01-01 | S-EPMC5508778 | BioStudies
2020-01-01 | S-EPMC7649008 | BioStudies
2018-01-01 | S-EPMC5834899 | BioStudies
2015-01-01 | S-EPMC4587741 | BioStudies
2017-01-01 | S-EPMC5411772 | BioStudies
2018-01-01 | S-EPMC6204047 | BioStudies
2014-01-01 | S-EPMC4156343 | BioStudies
2018-01-01 | S-EPMC5967465 | BioStudies