Dataset Information

Deconvolution of multiple infections in Plasmodium falciparum from high throughput sequencing data.


ABSTRACT: The presence of multiple infecting strains of the malarial parasite Plasmodium falciparum affects key phenotypic traits, including drug resistance and risk of severe disease. Advances in protocols and sequencing technology have made it possible to obtain high-coverage genome-wide sequencing data from blood samples and blood spots taken in the field. However, analyzing and interpreting such data is challenging because of the high rate of multiple infections present. We have developed a statistical method and implementation for deconvolving multiple genome sequences present in an individual with mixed infections. The software package DEploid uses haplotype structure within a reference panel of clonal isolates as a prior for haplotypes present in a given sample. It estimates the number of strains, their relative proportions and the haplotypes presented in a sample, allowing researchers to study multiple infection in malaria with an unprecedented level of detail. The open source implementation DEploid is freely available at https://github.com/mcveanlab/DEploid under the conditions of the GPLv3 license. An R version is available at https://github.com/mcveanlab/DEploid-r. Contact: joe.zhu@bdi.ox.ac.uk or gil.mcvean@bdi.ox.ac.uk. Supplementary data are available at Bioinformatics online.

SUBMITTER: Zhu SJ 

PROVIDER: S-EPMC5870807 | biostudies-other | 2018 Jan

REPOSITORIES: biostudies-other

Publications

Deconvolution of multiple infections in Plasmodium falciparum from high throughput sequencing data.

Zhu SJ, Almagro-Garcia J, McVean G

Bioinformatics (Oxford, England), 2018 Jan 1, issue 1


Similar Datasets

| S-EPMC3483182 | biostudies-literature
| S-EPMC3738909 | biostudies-literature
| S-EPMC6604269 | biostudies-literature
| S-EPMC4818741 | biostudies-other
| S-EPMC3832420 | biostudies-literature
| S-EPMC3458526 | biostudies-other