Dataset Information


Ultra-fast local-haplotype variant calling using paired-end DNA-sequencing data reveals somatic mosaicism in tumor and normal blood samples.

ABSTRACT: Somatic mosaicism refers to the existence of somatic mutations in a fraction of somatic cells in a single biological sample. Its importance has mainly been discussed in theory, although experimental work has started to emerge linking somatic mosaicism to disease diagnosis. Through novel statistical modeling of paired-end DNA-sequencing data using blood-derived DNA from healthy donors as well as DNA from tumor samples, we present an ultra-fast computational pipeline, LocHap, that searches for multiple single nucleotide variants (SNVs) that are scaffolded by the same reads. We refer to scaffolded SNVs as local haplotypes (LH). When an LH exhibits more than two genotypes, we call it a local haplotype variant (LHV). The presence of LHVs is considered evidence of somatic mosaicism because a genetically homogeneous cell population will not harbor LHVs. Applying LocHap to whole-genome and whole-exome sequence data in DNA from normal blood and tumor samples, we find widespread LHVs across the genome. Importantly, we find more LHVs in tumor samples than in normal samples, and more in older adults than in younger ones. We confirm the existence of LHVs and somatic mosaicism by validation studies in normal blood samples. LocHap is publicly available at http://www.compgenome.org/lochap.
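
ILLUSTRATION: The LHV definition in the abstract can be sketched as a simple counting exercise. The Python below is a toy illustration only and is not LocHap's statistical model: it tallies the allele combinations observed on reads that span every SNV position of a local haplotype, and flags an LHV when more than two distinct combinations are supported. The function names, the read representation (a dict mapping position to base), and the `min_reads` threshold are all assumptions made for this example.

```python
from collections import Counter


def count_local_haplotypes(reads, positions):
    """Tally allele combinations on reads that span every SNV position.

    `reads` is a hypothetical representation: each read (or read pair)
    is a dict mapping genomic position -> observed base.
    """
    counts = Counter()
    for read in reads:
        # Only reads covering all positions scaffold the SNVs together.
        if all(pos in read for pos in positions):
            counts[tuple(read[pos] for pos in positions)] += 1
    return counts


def is_lhv(counts, min_reads=2):
    """Flag an LHV when more than two haplotypes have enough read support.

    `min_reads` is an assumed, naive noise filter, not a LocHap parameter.
    """
    supported = [hap for hap, n in counts.items() if n >= min_reads]
    return len(supported) > 2


# Toy data: two SNV positions scaffolded by the same reads.
reads = [
    {100: "A", 150: "C"},
    {100: "A", 150: "C"},
    {100: "G", 150: "T"},
    {100: "G", 150: "T"},
    {100: "A", 150: "T"},
    {100: "A", 150: "T"},
    {100: "G"},  # covers only one position, so it is ignored
]

counts = count_local_haplotypes(reads, (100, 150))
lhv = is_lhv(counts)
print(dict(counts), lhv)
```

In this toy input three distinct allele combinations each appear on two reads, so the segment is flagged. A real caller would also need to account for sequencing error and mapping quality, which the pipeline described above handles through its statistical modeling.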

SUBMITTER: Sengupta S 

PROVIDER: S-EPMC4756850 | BioStudies | 2016-01-01

REPOSITORIES: biostudies

Similar Datasets

2017-01-01 | S-EPMC5378170 | BioStudies
2015-01-01 | S-EPMC4573927 | BioStudies
2010-01-01 | S-EPMC2896781 | BioStudies
2019-01-01 | S-EPMC6323907 | BioStudies
1000-01-01 | S-EPMC3739940 | BioStudies
2015-01-01 | S-EPMC4664477 | BioStudies
1000-01-01 | S-EPMC1735296 | BioStudies
2019-01-01 | S-EPMC6660700 | BioStudies
2009-01-01 | S-EPMC2729699 | BioStudies
1000-01-01 | S-EPMC4482059 | BioStudies