Dataset Information

De novo sequence assembly requires bioinformatic checking of chimeric sequences.


ABSTRACT: De novo assembly of sequence reads from next generation sequencing platforms is a common strategy for detecting the presence of viruses in biospecimens and for sequencing them. Amplification artifacts and the presence of several related viruses in the same specimen can lead to assembly of erroneous, chimeric sequences. We now report that such chimeras can also occur between viral and non-viral biological sequences that are incorrectly joined together, which may cause erroneous detection of viruses and highlights the importance of performing a chimera checking step in bioinformatics pipelines. Using Illumina NextSeq and metagenomic sequencing, we analyzed 80 consecutive non-melanoma skin cancers (NMSCs) from 11 immunosuppressed patients together with 11 NMSCs from patients who had developed only a single NMSC. We aligned high-quality reads against a Human Papillomavirus (HPV) database and found HPV sequences in 9/91 specimens. A previous bioinformatic analysis of the same crude sequencing data from some of these samples had found an additional 3 specimens to be HPV-positive after performing de novo assembly. The reason for the discrepancy was investigated and found to be mostly caused by chimeric sequences containing both viral and non-viral sequences. Non-viral sequences were present in these 3 samples. To avoid erroneous detection of HPV when performing sequencing, we thus developed a novel script to identify HPV chimeric sequences.
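The chimera checking step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' script, which is not reproduced on this page. It assumes each assembled contig has already been aligned against a viral database and a non-viral (for example, host) database, and that the hits are available as intervals on the contig. The names `Hit`, `covered_bases`, `is_chimeric` and the `min_fraction` threshold are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Hit:
    """One alignment of a contig region to a reference (hypothetical structure)."""
    start: int      # 0-based, inclusive position on the contig
    end: int        # 0-based, exclusive position on the contig
    is_viral: bool  # True if the reference is viral, False otherwise


def covered_bases(hits: Iterable[Hit], viral: bool) -> int:
    """Count contig positions covered by hits of one kind, merging overlaps."""
    intervals = sorted((h.start, h.end) for h in hits if h.is_viral == viral)
    total = 0
    cur_end = 0
    for start, end in intervals:
        start = max(start, cur_end)  # skip the part already counted
        if end > start:
            total += end - start
            cur_end = end
    return total


def is_chimeric(contig_length: int, hits: list[Hit], min_fraction: float = 0.1) -> bool:
    """Flag a contig when both viral and non-viral hits cover a sizeable share of it.

    min_fraction is an assumed, illustrative threshold, not a published value.
    """
    viral = covered_bases(hits, viral=True) / contig_length
    non_viral = covered_bases(hits, viral=False) / contig_length
    return viral >= min_fraction and non_viral >= min_fraction


# A 1000 bp contig whose first 300 bp match a virus and the rest matches non-viral sequence.
mixed = [Hit(0, 300, True), Hit(280, 1000, False)]
# A contig that matches a viral reference along its full length.
pure_viral = [Hit(0, 1000, True)]

print(is_chimeric(1000, mixed))
print(is_chimeric(1000, pure_viral))
```

A contig flagged this way would be reviewed or excluded before a specimen is called virus-positive. Positions matched by both a viral and a non-viral hit are counted in both totals, which is a simplification in this sketch.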

SUBMITTER: Arroyo Mühr LS

PROVIDER: S-EPMC7417191 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature

Publications

De novo sequence assembly requires bioinformatic checking of chimeric sequences.

Arroyo Mühr LS, Lagheden C, Hassan SS, Kleppe SN, Hultin E, Dillner J

PLoS One. 2020 Aug 10 (issue 8)


Similar Datasets

| S-EPMC4999880 | biostudies-other
| S-EPMC1289410 | biostudies-literature
| S-EPMC3272011 | biostudies-literature
2016-06-07 | MSV000079801 | MassIVE
| S-EPMC4403674 | biostudies-literature
| S-EPMC6326164 | biostudies-literature
| S-EPMC4326031 | biostudies-literature
| S-EPMC6362891 | biostudies-literature
| S-EPMC4169894 | biostudies-literature
| S-EPMC7039982 | biostudies-literature