Dataset Information

Fastq2vcf: a concise and transparent pipeline for whole-exome sequencing data analyses.


ABSTRACT: Whole-exome sequencing (WES) is a popular next-generation sequencing technology used by numerous laboratories with various levels of statistical and analytical expertise. Centralized databases, such as the Sequence Read Archive and the European Nucleotide Archive, allow data to be reanalyzed by independent labs to confirm results and derive additional insights. Access to new and shared data highlights the necessity for software that both lowers the statistical and analytical expertise required to generate results and promotes reproducible methodology among laboratories. We have developed fastq2vcf, a pipeline that automates the genomic variant calling process using multiple callers. Fastq2vcf offers improved flexibility, efficiency, and reproducibility by seamlessly integrating several leading sequencing analysis tools. It outputs not only the annotated variant call set for each caller, but also the consensus variant call set shared by different callers. Furthermore, it can be customized and extended easily. Our software tool automatically generates executable command lines for a variety of tools required for analyzing WES data. It is also highly configurable and provides users with complete control of the processing procedure, making it easy to submit and track jobs in both single workstation and parallelized computing environments. By using this pipeline, WES analysis can be easily reproduced.
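The "consensus variant call set shared by different callers" mentioned in the abstract can be pictured as a set intersection over variant records. The Python sketch below is illustrative only and is not taken from fastq2vcf itself: it assumes that two calls match when chromosome, position, reference allele, and alternate allele are identical, and the function names (`parse_vcf_lines`, `consensus_calls`) and caller labels (`caller_a`, `caller_b`) are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumed matching key for a variant: (chrom, pos, ref, alt).
Variant = Tuple[str, int, str, str]


def parse_vcf_lines(lines: Iterable[str]) -> Set[Variant]:
    """Collect variant keys from the data lines of a VCF, skipping headers."""
    variants: Set[Variant] = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, meta-information line, or column header
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        # A multi-allelic record lists several ALT alleles separated by commas;
        # each allele is treated as its own variant here.
        for allele in alt.split(","):
            variants.add((chrom, int(pos), ref, allele))
    return variants


def consensus_calls(call_sets: Dict[str, Set[Variant]]) -> Set[Variant]:
    """Return the variants reported by every caller in call_sets."""
    sets = list(call_sets.values())
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical output from two callers, written as minimal VCF lines.
caller_a = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tC\tT,G\t50\tPASS\t.",
]
caller_b = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t60\tPASS\t.",
    "chr1\t200\t.\tC\tT\t60\tPASS\t.",
    "chr2\t300\t.\tG\tA\t60\tPASS\t.",
]

shared = consensus_calls({
    "caller_a": parse_vcf_lines(caller_a),
    "caller_b": parse_vcf_lines(caller_b),
})
print(sorted(shared))
```

Exact-match intersection is the simplest possible rule. A real pipeline may normalize indel representation or compare genotypes before deciding that two callers agree, so this should be read as a starting point under the stated assumption.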

SUBMITTER: Gao X 

PROVIDER: S-EPMC4376134 | biostudies-literature | 2015 Mar

REPOSITORIES: biostudies-literature

Publications

Fastq2vcf: a concise and transparent pipeline for whole-exome sequencing data analyses.

Gao Xiaoyi, Xu Jianpeng, Starmer Joshua

BMC Research Notes, 2015-03-08


Similar Datasets

| S-EPMC4253833 | biostudies-other
| S-EPMC4929867 | biostudies-other
| S-EPMC7874478 | biostudies-literature
| S-EPMC4102467 | biostudies-literature
| S-EPMC5549930 | biostudies-other
| S-EPMC4585643 | biostudies-literature
| S-EPMC5818140 | biostudies-literature
| S-EPMC3411592 | biostudies-literature
| S-EPMC3362899 | biostudies-other
| S-EPMC4169871 | biostudies-other