Dataset Information

Probabilistic inference of viral quasispecies subject to recombination.


ABSTRACT: RNA viruses exist in their hosts as populations of different but related strains. The virus population, often called quasispecies, is shaped by a combination of genetic change and natural selection. Genetic change is due to both point mutations and recombination events. We present a jumping hidden Markov model that describes the generation of viral quasispecies and a method to infer its parameters from next-generation sequencing data. The model introduces position-specific probability tables over the sequence alphabet to explain the diversity that can be found in the population at each site. Recombination events are indicated by a change of state, allowing a single observed read to originate from multiple sequences. We present a specific implementation of the expectation maximization (EM) algorithm to find maximum a posteriori estimates of the model parameters and a method to estimate the distribution of viral strains in the quasispecies. The model is validated on simulated data, showing the advantage of explicitly taking the recombination process into account, and applied to reads obtained from a clinical HIV sample.
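The generative process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and it does not cover the EM inference step. The names `make_tables`, `sample_read`, `jump_prob`, and `concentration`, the uniform choice of starting state, and the uniform choice of jump target are all assumptions made for illustration. Each hidden state is a sequence generator with one probability table per site, and a change of state while emitting a read stands for a recombination event.

```python
import random

ALPHABET = "ACGT"


def make_tables(num_generators, length, rng, concentration=0.05):
    """Build hypothetical position-specific probability tables.

    tables[k][j] maps each base to its probability for generator k at
    site j. Each table puts most of its mass on one randomly chosen base.
    """
    tables = []
    for _ in range(num_generators):
        sites = []
        for _ in range(length):
            major = rng.choice(ALPHABET)
            minor_mass = concentration / (len(ALPHABET) - 1)
            probs = {base: minor_mass for base in ALPHABET}
            probs[major] = 1.0 - concentration
            sites.append(probs)
        tables.append(sites)
    return tables


def sample_read(tables, start, read_len, jump_prob, rng):
    """Sample one read and the generator path that produced it.

    The starting generator is drawn uniformly (an assumption). At every
    later site the process jumps to a different generator with
    probability jump_prob, which models a recombination event.
    """
    num_generators = len(tables)
    state = rng.randrange(num_generators)
    bases, path = [], []
    for j in range(start, start + read_len):
        if j > start and num_generators > 1 and rng.random() < jump_prob:
            others = [k for k in range(num_generators) if k != state]
            state = rng.choice(others)
        probs = tables[state][j]
        base = rng.choices(list(probs), weights=list(probs.values()))[0]
        bases.append(base)
        path.append(state)
    return "".join(bases), path


if __name__ == "__main__":
    rng = random.Random(0)
    tables = make_tables(num_generators=3, length=50, rng=rng)
    read, path = sample_read(tables, start=10, read_len=20, jump_prob=0.1, rng=rng)
    print(read)
    print(path)
```

A read whose path contains more than one generator index originates from multiple sequences, which is the situation the model is designed to explain. Setting `jump_prob` to 0 reduces the sketch to a model without recombination.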

SUBMITTER: Töpfer A

PROVIDER: S-EPMC3576916 | biostudies-literature | 2013 Feb

REPOSITORIES: biostudies-literature

Publications

Probabilistic inference of viral quasispecies subject to recombination.

Töpfer A, Zagordi O, Prabhakaran S, Roth V, Halperin E, Beerenwinkel N

Journal of Computational Biology: a journal of computational molecular cell biology, 2013 Feb 1, issue 2


Similar Datasets

| S-EPMC4234478 | biostudies-literature
| S-EPMC3372249 | biostudies-other
| S-EPMC6797082 | biostudies-literature
| S-EPMC2921378 | biostudies-literature
| S-EPMC6492297 | biostudies-literature
| S-EPMC5527101 | biostudies-literature
| S-EPMC3274721 | biostudies-literature
| S-EPMC1090553 | biostudies-literature