
Dataset Information


Expanded functionality, increased accuracy, and enhanced speed in the de novo genotyping-by-sequencing pipeline GBS-SNP-CROP.


ABSTRACT:

Summary

GBS-SNP-CROP is a bioinformatics pipeline originally developed to support the cost-effective genome-wide characterization of plant genetic resources through paired-end genotyping-by-sequencing (GBS), particularly in the absence of a reference genome. Since its 2016 release, the pipeline's functionality has greatly expanded, its computational efficiency has improved, and its applicability to a broad set of genomic studies for both plants and animals has been demonstrated. This note details the suite of improvements to date, as realized in GBS-SNP-CROP v.4.0, with specific attention paid to a new integrated metric that facilitates reliable variant identification despite the complications of homologs. Using the new de novo GBS read simulator GBS-Pacecar, also introduced in this note, results show an improvement in overall pipeline accuracy from 66% (v.1.0) to 84% (v.4.0), with a time saving of ∼70%. Both GBS-SNP-CROP versions significantly outperform TASSEL-UNEAK; and v.4.0 resolves the issue of non-overlapping variant calls observed between UNEAK and v.1.0.

Availability and implementation

GBS-SNP-CROP source code and user manual are available at https://github.com/halelab/GBS-SNP-CROP. The GBS read simulator GBS-Pacecar is available at https://github.com/halelab/GBS-Pacecar.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Melo ATO 

PROVIDER: S-EPMC6513162 | biostudies-literature | 2019 May

REPOSITORIES: biostudies-literature


Publications

Expanded functionality, increased accuracy, and enhanced speed in the de novo genotyping-by-sequencing pipeline GBS-SNP-CROP.

Melo, Arthur T O; Hale, Iago

Bioinformatics (Oxford, England), 2019-05-01, issue 10



Similar Datasets

| S-EPMC4709900 | biostudies-literature
| S-EPMC3938676 | biostudies-literature
| S-EPMC3553054 | biostudies-literature
| S-EPMC2703893 | biostudies-literature
| S-EPMC6322752 | biostudies-literature
| S-EPMC6291945 | biostudies-literature
| S-EPMC5210301 | biostudies-literature
| S-EPMC6435650 | biostudies-literature
| S-EPMC4993469 | biostudies-literature
| S-EPMC8142480 | biostudies-literature