Dataset Information

Efficient genotype compression and analysis of large genetic-variation data sets.


ABSTRACT: Genotype Query Tools (GQT) is an indexing strategy that expedites analyses of genome-variation data sets in Variant Call Format based on sample genotypes, phenotypes and relationships. GQT's compressed genotype index minimizes decompression for analysis, and its performance relative to that of existing methods improves with cohort size. We show substantial (up to 443-fold) gains in performance over existing methods and demonstrate GQT's utility for exploring massive data sets involving thousands to millions of genomes. GQT can be accessed at https://github.com/ryanlayer/gqt.

SUBMITTER: Layer RM 

PROVIDER: S-EPMC4697868 | biostudies-other | 2016 Jan

REPOSITORIES: biostudies-other

Publications

Efficient genotype compression and analysis of large genetic-variation data sets.

Layer Ryan M, Kindlon Neil, Karczewski Konrad J, Quinlan Aaron R

Nature Methods, 2015-11-09, issue 1


Similar Datasets

| S-EPMC5602011 | biostudies-literature
| S-EPMC5860313 | biostudies-literature
| S-EPMC3185442 | biostudies-literature
| S-EPMC3439324 | biostudies-literature
| S-EPMC3579489 | biostudies-literature
| S-EPMC2926624 | biostudies-literature
| S-EPMC8595950 | biostudies-literature
| S-EPMC3700844 | biostudies-other
| S-EPMC6218042 | biostudies-literature