Dataset Information

Improving detection of copy-number variation by simultaneous bias correction and read-depth segmentation.


ABSTRACT: Structural variation is an important class of genetic variation in mammals. High-throughput sequencing (HTS) technologies promise to revolutionize copy-number variation (CNV) detection but present substantial analytic challenges. Converging evidence suggests that multiple types of CNV-informative data (e.g. read-depth, read-pair, split-read) need to be considered, and that sophisticated methods are needed for more accurate CNV detection. We observed that various sources of experimental biases in HTS confound read-depth estimation, and note that bias correction has not been adequately addressed by existing methods. We present a novel read-depth-based method, GENSENG, which uses a hidden Markov model and negative binomial regression framework to identify regions of discrete copy-number changes while simultaneously accounting for the effects of multiple confounders. Based on extensive calibration using multiple HTS data sets, we conclude that our method outperforms existing read-depth-based CNV detection algorithms. The concept of simultaneous bias correction and CNV detection can serve as a basis for combining read-depth with other types of information such as read-pair or split-read in a single analysis. A user-friendly and computationally efficient implementation of our method is freely available.
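To make the idea of combining a hidden Markov model with negative binomial emissions concrete, the following is a minimal toy sketch in Python. It is not the GENSENG implementation and does not reproduce its model. Everything in it is an assumption made for illustration: the function names (nb_logpmf, expected_depth, viterbi_copy_number), the copy-number state set 0 to 4, the single GC-content covariate with a fixed coefficient, the fixed dispersion, and the transition probabilities. A real method would estimate the regression coefficients and dispersion from data rather than fix them.

```python
import math


def nb_logpmf(k, mu, size):
    """Log-probability of count k under a negative binomial with mean mu
    and dispersion parameter size (variance = mu + mu**2 / size)."""
    return (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
            + size * math.log(size / (size + mu))
            + k * math.log(mu / (size + mu)))


def expected_depth(copy_number, gc, base_depth=100.0, gc_coef=1.0, floor=0.5):
    """Expected read count in a window: diploid baseline scaled by copy number
    and by a log-linear GC term (hypothetical confounder, fixed coefficient).
    The floor keeps the mean positive for copy number 0."""
    bias = math.exp(gc_coef * (gc - 0.5))
    return max(base_depth * (copy_number / 2.0) * bias, floor)


def viterbi_copy_number(counts, gc, states=(0, 1, 2, 3, 4), stay=0.9, size=20.0):
    """Most likely copy-number path for per-window read counts (Viterbi)."""
    n = len(states)
    log_stay = math.log(stay)
    log_move = math.log((1.0 - stay) / (n - 1))

    def emit(i, state):
        return nb_logpmf(counts[i], expected_depth(state, gc[i]), size)

    def trans(prev, cur):
        return log_stay if prev == cur else log_move

    # Uniform prior over states for the first window.
    score = [math.log(1.0 / n) + emit(0, s) for s in states]
    back = []
    for i in range(1, len(counts)):
        new_score, ptr = [], []
        for j, s in enumerate(states):
            best = max(range(n), key=lambda p: score[p] + trans(p, j))
            new_score.append(score[best] + trans(best, j) + emit(i, s))
            ptr.append(best)
        score = new_score
        back.append(ptr)

    # Trace back from the best final state.
    j = max(range(n), key=lambda p: score[p])
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [states[j] for j in path]


if __name__ == "__main__":
    counts = [100, 102, 98, 50, 48, 52, 49, 101, 99]  # made-up window counts
    gc = [0.5] * len(counts)                          # made-up GC fractions
    print(viterbi_copy_number(counts, gc))
```

In this sketch the confounder enters the emission mean directly, so a window whose depth is low because of GC content is not automatically read as a deletion. That is the sense in which bias correction and segmentation happen in one step rather than as a separate normalization pass.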

SUBMITTER: Szatkiewicz JP 

PROVIDER: S-EPMC3561969 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC3640002 | biostudies-literature
| S-EPMC5777982 | biostudies-literature
| S-EPMC4021345 | biostudies-literature
| S-EPMC10762977 | biostudies-literature
| S-EPMC7447571 | biostudies-literature
| S-EPMC2752127 | biostudies-literature
| S-EPMC8612020 | biostudies-literature
| S-EPMC5018994 | biostudies-literature
| S-EPMC3571941 | biostudies-literature
| S-EPMC2878862 | biostudies-literature