Dataset Information

HiFiAdapterFilt, a memory efficient read processing pipeline, prevents occurrence of adapter sequence in PacBio HiFi reads and their negative impacts on genome assembly.


ABSTRACT:

Background

Pacific Biosciences HiFi read technology is currently the industry standard for high-accuracy long-read sequencing and has been widely adopted by large sequencing and assembly initiatives to generate de novo assemblies of non-model organisms. Although adapter contamination filtering is routine in traditional short-read analysis pipelines, it has not been widely adopted for HiFi workflows.

Results

Analysis of 55 publicly available HiFi datasets revealed that a read-sanitation step is necessary to remove sequence artifacts derived from PacBio library preparation from read pools, because adapter sequences can be erroneously integrated into assemblies.

Conclusions

Here we describe the nature of adapter-contaminated reads and their consequences in assembly, and present HiFiAdapterFilt, a simple and memory-efficient solution for removing adapter-contaminated reads prior to assembly.

SUBMITTER: Sim SB 

PROVIDER: S-EPMC8864876 | biostudies-literature | 2022 Feb

REPOSITORIES: biostudies-literature

Publications

HiFiAdapterFilt, a memory efficient read processing pipeline, prevents occurrence of adapter sequence in PacBio HiFi reads and their negative impacts on genome assembly.

Sheina B Sim, Renee L Corpuz, Tyler J Simmonds, Scott M Geib

BMC Genomics, 22 February 2022, issue 1


Similar Datasets

| S-EPMC9633807 | biostudies-literature
| S-EPMC10776742 | biostudies-literature
| S-EPMC11358474 | biostudies-literature
| S-EPMC11604747 | biostudies-literature
| S-EPMC11569603 | biostudies-literature
| S-EPMC10744512 | biostudies-literature
| S-EPMC10338486 | biostudies-literature
| S-EPMC9943720 | biostudies-literature
| S-EPMC8281077 | biostudies-literature
| S-EPMC8903783 | biostudies-literature