Dataset Information

Detecting copy number variations from array CGH data based on a conditional random field model.


ABSTRACT: Array comparative genomic hybridization (aCGH) allows identification of copy number alterations across genomes. The key computational challenge in analyzing copy number variations (CNVs) using aCGH data, or similar data generated by a variety of array technologies, is detecting the segment boundaries of copy number changes and inferring the copy number state of each segment. We have developed a novel statistical model based on conditional random fields (CRFs) that combines data smoothing, segmentation and copy number state decoding in one unified framework. Our approach (termed CRF-CNV) provides great flexibility in defining meaningful feature functions, so it can effectively integrate local spatial information of arbitrary sizes into the model. For model parameter estimation, we have adopted the conjugate gradient (CG) method for likelihood optimization and developed efficient forward/backward algorithms within the CG framework. The method is evaluated using real data with known copy numbers as well as simulated data with realistic assumptions, and is compared with two popular publicly available programs. Experimental results demonstrate that CRF-CNV outperforms a Bayesian Hidden Markov Model-based approach on both datasets in terms of copy number assignments. Compared with a non-parametric approach, CRF-CNV achieves much greater precision while maintaining the same level of recall on the real data, and the two methods perform comparably on the simulated data.
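The decoding step described in the abstract can be illustrated with a minimal linear-chain CRF sketch. This is not the CRF-CNV implementation: the three states, their log2-ratio centers, the window-median feature, and the hand-set weights `w_emit` and `w_stay` are all assumptions made for illustration. In the paper the weights are estimated by conjugate gradient, whereas here they are fixed. The sketch shows a forward pass that computes the log partition function and a Viterbi pass that assigns a state to each probe.

```python
import math

# Hypothetical copy number states and assumed log2-ratio centers.
STATES = ("loss", "neutral", "gain")
MEANS = {"loss": -0.6, "neutral": 0.0, "gain": 0.45}


def window_median(x, i, half_width):
    """Median of the probes within half_width of position i (local smoothing)."""
    lo, hi = max(0, i - half_width), min(len(x), i + half_width + 1)
    window = sorted(x[lo:hi])
    mid = len(window) // 2
    if len(window) % 2:
        return window[mid]
    return 0.5 * (window[mid - 1] + window[mid])


def node_score(x, i, state, w_emit=10.0, half_width=1):
    """Feature on (state, local window): penalize distance from the state center."""
    return -w_emit * (window_median(x, i, half_width) - MEANS[state]) ** 2


def edge_score(prev, cur, w_stay=2.0):
    """Feature on adjacent states: reward staying in the same state."""
    return w_stay if prev == cur else 0.0


def logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_partition(x):
    """Forward algorithm in log space: log of the sum over all state paths."""
    alpha = [node_score(x, 0, s) for s in STATES]
    for i in range(1, len(x)):
        alpha = [
            node_score(x, i, s)
            + logsumexp([alpha[j] + edge_score(p, s) for j, p in enumerate(STATES)])
            for s in STATES
        ]
    return logsumexp(alpha)


def viterbi(x):
    """Highest-scoring state path under the node and edge scores above."""
    score = [node_score(x, 0, s) for s in STATES]
    back = []
    for i in range(1, len(x)):
        new_score, pointers = [], []
        for s in STATES:
            cands = [score[j] + edge_score(p, s) for j, p in enumerate(STATES)]
            best = max(range(len(STATES)), key=cands.__getitem__)
            new_score.append(cands[best] + node_score(x, i, s))
            pointers.append(best)
        score = new_score
        back.append(pointers)
    k = max(range(len(STATES)), key=score.__getitem__)
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    return [STATES[k] for k in reversed(path)]


if __name__ == "__main__":
    # Made-up log2 ratios, not data from the study.
    log_ratios = [0.0, 0.05, -0.02, 0.5, 0.45, 0.48, 0.0, 0.01]
    print(viterbi(log_ratios))
    print(log_partition(log_ratios))
```

Because the node feature may depend on any window of observations, widening `half_width` changes how much local spatial information enters the score without changing the forward or Viterbi recursions. The gradient of the conditional log-likelihood needed for CG training would additionally require a backward pass, which is omitted here.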

SUBMITTER: Yin XL 

PROVIDER: S-EPMC3326659 | biostudies-literature | 2010 Apr

REPOSITORIES: biostudies-literature

Publications

Detecting copy number variations from array CGH data based on a conditional random field model.

Yin Xiao-Lin, Li Jing

Journal of bioinformatics and computational biology 20100401 2


Similar Datasets

| S-EPMC3079218 | biostudies-literature
2010-12-31 | GSE21387 | GEO
| S-EPMC4154476 | biostudies-literature
| S-EPMC3907382 | biostudies-literature
| S-EPMC3573951 | biostudies-literature
2010-12-31 | E-GEOD-21387 | biostudies-arrayexpress
| S-EPMC4478553 | biostudies-literature
| S-EPMC2978932 | biostudies-literature
| S-EPMC1994778 | biostudies-literature
| S-EPMC3158569 | biostudies-literature