hsegHMM: hidden Markov model-based allele-specific copy number alteration analysis accounting for hypersegmentation.


ABSTRACT:

Background

Somatic copy number alteration (SCNA) is a common feature of the cancer genome and is associated with cancer etiology and prognosis. Allele-specific SCNA analysis of a tumor sample aims to identify the copy numbers of both alleles, adjusting for ploidy and tumor purity. Next-generation sequencing platforms produce abundant read counts at base-pair resolution across the exome or whole genome, and analyses of these data are susceptible to hypersegmentation, a phenomenon in which numerous very short regions are falsely identified as SCNAs.

Results

We propose hsegHMM, a hidden Markov model approach to allele-specific SCNA analysis that accounts for hypersegmentation. hsegHMM provides statistical inference of copy number profiles using an efficient expectation-maximization (E-M) algorithm. Through simulation and application studies, we found that hsegHMM handles hypersegmentation effectively by using a t-distribution as part of the emission probability distribution and a carefully defined state space. We also compared hsegHMM with FACETS, a current method for allele-specific SCNA analysis. For the application, we used a renal cell carcinoma sample from The Cancer Genome Atlas (TCGA) study.
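The role of the heavy-tailed emission distribution can be illustrated with a small sketch. This is not the hsegHMM implementation: it decodes a single made-up log-ratio track with a two-state Viterbi pass and fixed parameters, whereas the abstract describes E-M estimation over a carefully defined state space. All names (`viterbi`, `normal_logpdf`, `student_t_logpdf`), the state means, the scale of 0.2, the 3 degrees of freedom, and the stay probability of 0.99 are assumptions chosen for illustration.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def student_t_logpdf(x, mu, sigma, nu):
    """Log density of a location-scale Student t with nu degrees of freedom."""
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - 0.5 * (nu + 1.0) * math.log1p(z * z / nu))


def viterbi(obs, means, log_emission, stay_prob=0.99):
    """Most likely state path for a 'sticky' HMM with a uniform initial state."""
    k = len(means)
    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (k - 1))
    score = [math.log(1.0 / k) + log_emission(obs[0], m) for m in means]
    back = []  # back[t-1][j] = best predecessor of state j at position t
    for x in obs[1:]:
        new_score, pointers = [], []
        for j in range(k):
            cands = [score[i] + (log_stay if i == j else log_switch)
                     for i in range(k)]
            best = max(range(k), key=lambda i: cands[i])
            pointers.append(best)
            new_score.append(cands[best] + log_emission(x, means[j]))
        score = new_score
        back.append(pointers)
    state = max(range(k), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical track: a flat segment with one isolated outlier at index 3.
obs = [0.0, 0.05, -0.05, 1.0, 0.02, -0.03, 0.0]
means = [0.0, 1.0]  # assumed state means (e.g. "neutral" and "gain")

gauss_path = viterbi(obs, means, lambda x, m: normal_logpdf(x, m, 0.2))
t_path = viterbi(obs, means, lambda x, m: student_t_logpdf(x, m, 0.2, 3.0))
print("Gaussian emissions: ", gauss_path)
print("Student-t emissions:", t_path)
```

Under these assumed settings, the Gaussian decoding assigns the isolated outlier to its own one-marker segment, while the t-distributed decoding keeps it in the surrounding segment. The heavy tails make a single extreme value cheaper to explain as noise than as two state switches.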

Conclusions

We demonstrate the robustness of hsegHMM to hypersegmentation. Furthermore, hsegHMM quantifies the uncertainty in identifying allele-specific SCNAs across entire chromosomes. hsegHMM performs better than FACETS when read depth (coverage) is uneven across the genome.

SUBMITTER: Choo-Wosoba H 

PROVIDER: S-EPMC6236906 | biostudies-literature | 2018 Nov

REPOSITORIES: biostudies-literature

Publications

hsegHMM: hidden Markov model-based allele-specific copy number alteration analysis accounting for hypersegmentation.

Choo-Wosoba H, Albert PS, Zhu B

BMC Bioinformatics, 2018 Nov 14, issue 1



Similar Datasets

S-EPMC4029584 | biostudies-literature
S-EPMC3130254 | biostudies-literature
S-EPMC2947907 | biostudies-other
S-EPMC3371636 | biostudies-literature
S-EPMC4866742 | biostudies-literature
S-EPMC1874617 | biostudies-literature
S-EPMC4344483 | biostudies-literature
S-EPMC9454926 | biostudies-literature
S-EPMC3375654 | biostudies-literature
S-EPMC2723089 | biostudies-other