
Dataset Information


Ultra-fast local-haplotype variant calling using paired-end DNA-sequencing data reveals somatic mosaicism in tumor and normal blood samples.


ABSTRACT: Somatic mosaicism refers to the existence of somatic mutations in a fraction of somatic cells in a single biological sample. Its importance has mainly been discussed in theory, although experimental work has started to emerge linking somatic mosaicism to disease diagnosis. Through novel statistical modeling of paired-end DNA-sequencing data using blood-derived DNA from healthy donors as well as DNA from tumor samples, we present an ultra-fast computational pipeline, LocHap, that searches for multiple single nucleotide variants (SNVs) that are scaffolded by the same reads. We refer to scaffolded SNVs as local haplotypes (LH). When an LH exhibits more than two genotypes, we call it a local haplotype variant (LHV). The presence of LHVs is considered evidence of somatic mosaicism because a genetically homogeneous cell population will not harbor LHVs. Applying LocHap to whole-genome and whole-exome sequence data in DNA from normal blood and tumor samples, we find widespread LHVs across the genome. Importantly, we find more LHVs in tumor samples than in normal samples, and more in older adults than in younger ones. We confirm the existence of LHVs and somatic mosaicism by validation studies in normal blood samples. LocHap is publicly available at http://www.compgenome.org/lochap.
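
The LHV definition in the abstract can be illustrated with a minimal sketch. This is not LocHap's statistical model, which the abstract does not describe in detail. It only counts the distinct allele combinations seen on reads that span every SNV in a candidate local haplotype, and flags the locus when more than two are supported. The function name `call_local_haplotype_variant`, the `min_support` threshold, and the dictionary-based read representation are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def call_local_haplotype_variant(reads, positions, min_support=2):
    """Flag a candidate local haplotype (LH) as a local haplotype variant (LHV).

    reads: list of dicts mapping SNV position -> base seen by one read or read pair
           (hypothetical representation, not LocHap's input format).
    positions: tuple of SNV positions that make up the candidate LH.
    min_support: assumed minimum number of reads needed to trust a haplotype.
    """
    haplotypes = Counter()
    for read in reads:
        # Only reads covering every SNV scaffold the haplotype.
        if all(pos in read for pos in positions):
            haplotypes["".join(read[pos] for pos in positions)] += 1
    # Crude guard against sequencing error: drop weakly supported haplotypes.
    supported = {h: n for h, n in haplotypes.items() if n >= min_support}
    # More than two supported combinations cannot come from one diploid genotype.
    return len(supported) > 2, supported


example_reads = (
    [{100: "A", 150: "C"}] * 3
    + [{100: "G", 150: "T"}] * 3
    + [{100: "A", 150: "T"}] * 2
    + [{100: "G", 150: "C"}]  # single read, below min_support
    + [{100: "A"}]            # does not span both SNVs, ignored
)
is_lhv, supported = call_local_haplotype_variant(example_reads, (100, 150))
print(is_lhv, supported)  # True {'AC': 3, 'GT': 3, 'AT': 2}
```

A fixed read-count threshold is used here only to keep the example short. A real caller would need an explicit error model to separate true low-frequency haplotypes from sequencing noise.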

SUBMITTER: Sengupta S 

PROVIDER: S-EPMC4756850 | biostudies-literature | 2016 Feb

REPOSITORIES: biostudies-literature


Publications

Ultra-fast local-haplotype variant calling using paired-end DNA-sequencing data reveals somatic mosaicism in tumor and normal blood samples.

Subhajit Sengupta, Kamalakar Gulukota, Yitan Zhu, Carole Ober, Katherine Naughton, William Wentworth-Sheilds, Yuan Ji

Nucleic Acids Research, 2015 Sep 29; issue 3



Similar Datasets

S-EPMC3907544 | biostudies-literature
S-EPMC6211471 | biostudies-literature
S-EPMC5584341 | biostudies-literature
S-DIXA-D-1103 | biostudies-other
S-EPMC7484502 | biostudies-literature
S-EPMC3259434 | biostudies-literature
S-EPMC5834899 | biostudies-literature
S-EPMC8756192 | biostudies-literature