Dataset Information

Parsimonious higher-order hidden Markov models for improved array-CGH analysis with applications to Arabidopsis thaliana.


ABSTRACT: Array-based comparative genomic hybridization (Array-CGH) is an important technology in molecular biology for the detection of DNA copy number polymorphisms between closely related genomes. Hidden Markov Models (HMMs) are popular tools for the analysis of Array-CGH data, but current methods are based only on first-order HMMs, which have limited ability to model spatial dependencies between measurements of closely adjacent chromosomal regions. Here, we develop parsimonious higher-order HMMs that enable interpolation between a mixture model, which ignores spatial dependencies, and a higher-order HMM, which models them exhaustively. We apply parsimonious higher-order HMMs to the analysis of Array-CGH data of the accessions C24 and Col-0 of the model plant Arabidopsis thaliana. We compare these models against first-order HMMs and other existing methods using a reference of known deletions and sequence deviations. We find that parsimonious higher-order HMMs clearly improve the identification of these polymorphisms. Moreover, we perform a functional analysis of the identified polymorphisms, revealing novel details of genomic differences between C24 and Col-0. Additional model evaluations on widely used Array-CGH data of human cell lines indicate that parsimonious HMMs are also well suited for the analysis of non-plant data. Together, these results indicate that parsimonious higher-order HMMs are useful for Array-CGH analyses. An implementation of parsimonious higher-order HMMs is available as part of the open source Java library Jstacs (www.jstacs.de/index.php/PHHMM).
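
The interpolation described in the abstract can be illustrated by counting transition parameters. The sketch below is not the Jstacs implementation; it assumes three hypothetical state labels and assumes that a model is made parsimonious by tying state histories (contexts) into groups that share one transition distribution. With all contexts tied, the next state ignores the history, as in a mixture model; with no tying, the result is a full higher-order HMM.

```python
from itertools import product

# Hypothetical state labels: deletion, unchanged, amplification.
STATES = ("-", "=", "+")


def all_contexts(order):
    """Return every state history of the given length."""
    return list(product(STATES, repeat=order))


def free_transition_params(partition):
    """Count free transition parameters for a partition of contexts.

    Each group of tied contexts shares one probability distribution over
    the next state, which has len(STATES) - 1 free parameters.
    """
    return len(partition) * (len(STATES) - 1)


order = 2
contexts = all_contexts(order)

# All contexts tied together: the next state does not depend on the history.
mixture_like = [contexts]

# Contexts tied when they agree on the most recent state: first-order behaviour.
by_last_state = [[c for c in contexts if c[-1] == s] for s in STATES]

# No tying at all: a full second-order model.
full_order = [[c] for c in contexts]

for name, partition in [
    ("mixture-like", mixture_like),
    ("tied by last state", by_last_state),
    ("full second-order", full_order),
]:
    print(f"{name}: {len(partition)} groups, "
          f"{free_transition_params(partition)} free transition parameters")
```

Any partition between the two extremes gives a model of intermediate complexity, which is the sense in which such a model can trade off the amount of spatial dependency it captures against the number of parameters to estimate.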

SUBMITTER: Seifert M 

PROVIDER: S-EPMC3257270 | biostudies-literature | 2012 Jan

REPOSITORIES: biostudies-literature

Publications

Parsimonious higher-order hidden Markov models for improved array-CGH analysis with applications to Arabidopsis thaliana.

Michael Seifert, André Gohr, Marc Strickert, Ivo Grosse

PLoS Computational Biology, 2012-01-12, issue 1


Similar Datasets

| S-EPMC3286622 | biostudies-literature
| S-EPMC3052263 | biostudies-literature
| S-EPMC8044428 | biostudies-literature
| S-EPMC6851437 | biostudies-literature
| S-EPMC2998602 | biostudies-literature
| S-EPMC4814833 | biostudies-other
| S-EPMC6355111 | biostudies-literature
| S-EPMC6660659 | biostudies-literature
| S-EPMC6504179 | biostudies-literature