Dataset Information

GATK PathSeq: a customizable computational tool for the discovery and identification of microbial sequences in libraries from eukaryotic hosts.


ABSTRACT: Summary: We present an updated version of our computational pipeline, PathSeq, for the discovery and identification of microbial sequences in genomic and transcriptomic libraries from eukaryotic hosts. This pipeline is available in the Genome Analysis Toolkit (GATK) as a suite of configurable tools that can report the microbial composition of DNA or RNA short-read sequencing samples and identify unknown sequences for downstream assembly of novel organisms. GATK PathSeq enables sample analysis in minutes at low cost. In addition, these tools are built with the GATK engine and Apache Spark framework, providing robust, rapid parallelization of read quality filtering, host subtraction and microbial alignment in workstation, cluster and cloud environments. Availability and implementation: These tools are available as a part of the GATK at https://github.com/broadinstitute/gatk. Supplementary information: Supplementary data are available at Bioinformatics online.
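
The three stages named in the abstract (read quality filtering, host subtraction, microbial alignment) can be illustrated with a toy sketch. This is not the GATK PathSeq implementation or its API: every function name, threshold and sequence below is hypothetical, exact k-mer matching stands in for real alignment, and nothing here is parallelized with Spark. It only shows how reads flow through the stages and how reads matching neither host nor microbe end up in an "unknown" set for downstream assembly.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def passes_quality(read, max_n_fraction=0.1, min_length=8):
    """Toy quality filter (assumed thresholds): drop short or N-rich reads."""
    if len(read) < min_length:
        return False
    return read.count("N") / len(read) <= max_n_fraction


def subtract_host(reads, host_kmers, k):
    """Keep only reads that share no k-mer with the host reference."""
    return [r for r in reads if kmers(r, k).isdisjoint(host_kmers)]


def classify(reads, microbe_refs, k):
    """Assign each read to the reference sharing the most k-mers.

    Reads sharing no k-mer with any reference are returned as unknown.
    """
    ref_kmers = {name: kmers(seq, k) for name, seq in microbe_refs.items()}
    counts = Counter()
    unknown = []
    for read in reads:
        read_kmers = kmers(read, k)
        best_name, best_hits = None, 0
        for name, ref_set in ref_kmers.items():
            hits = len(read_kmers & ref_set)
            if hits > best_hits:
                best_name, best_hits = name, hits
        if best_name is None:
            unknown.append(read)
        else:
            counts[best_name] += 1
    return counts, unknown


def toy_pipeline(reads, host_seq, microbe_refs, k=6):
    """Run the three stages in order: filter, host subtraction, classification."""
    filtered = [r for r in reads if passes_quality(r)]
    non_host = subtract_host(filtered, kmers(host_seq, k), k)
    return classify(non_host, microbe_refs, k)


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    host = "ACGTACGTACGTACGT"
    microbes = {"microbe_A": "TTGACCATGGCATTAGC"}
    reads = ["ACGTACGTAC", "GACCATGGCA", "NNNNNACGTA", "CCCCCCGGGGGG"]
    counts, unknown = toy_pipeline(reads, host, microbes)
    print(dict(counts), unknown)
```

In this sketch the host-derived read and the N-rich read are discarded before classification, so only non-host reads are compared against the microbial references. The actual tools are documented in the GATK repository linked above.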

SUBMITTER: Walker MA 

PROVIDER: S-EPMC6289130 | biostudies-literature | 2018 Dec

REPOSITORIES: biostudies-literature

Publications

GATK PathSeq: a customizable computational tool for the discovery and identification of microbial sequences in libraries from eukaryotic hosts.

Walker MA, Pedamallu CS, Ojesina AI, Bullman S, Sharpe T, Whelan CW, Meyerson M

Bioinformatics (Oxford, England), 2018 Dec 1, issue 24

Similar Datasets

S-EPMC10619229 | biostudies-literature
S-EPMC551570 | biostudies-literature
S-EPMC11665472 | biostudies-literature
S-EPMC9692083 | biostudies-literature
S-EPMC4053730 | biostudies-literature
S-EPMC152451 | biostudies-literature
S-EPMC3234254 | biostudies-literature
S-EPMC11536983 | biostudies-literature
S-EPMC8346489 | biostudies-literature
S-EPMC10904014 | biostudies-literature