
Dataset Information


FindAdapt: A python package for fast and accurate adapter detection in small RNA sequencing.


ABSTRACT: Adapter trimming is an essential step in analyzing small RNA sequencing data, because reads are generally longer than the target RNAs, which range from 18 to 30 bp. Most adapter trimming tools require adapter information as input. However, for publicly available datasets this information is often hard to access, specified incorrectly, or not provided at all, which hampers reproducibility and reusability. Manual identification of adapter patterns from raw reads is labor-intensive and error-prone. Moreover, the use of randomized adapters to reduce ligation biases during library preparation makes adapter detection even more challenging. Here, we present FindAdapt, a Python package for fast and accurate detection of adapter patterns without relying on prior information. We demonstrated that FindAdapt was far superior to existing approaches. It identified adapters successfully in 180 simulation datasets with diverse read structures and 3,184 real datasets covering a variety of commercial and customized small RNA library preparation kits. FindAdapt is stand-alone software that can be easily integrated into small RNA sequencing analysis pipelines.
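
To illustrate why adapter detection is possible without prior information, the sketch below uses a deliberately naive heuristic: because every read runs through a short insert into the same 3' adapter, the k-mer shared by the most reads (and starting earliest on average) is a plausible guess for the start of the adapter. This is not FindAdapt's algorithm or API; the function names `guess_adapter_kmer` and `trim_adapter`, the choice of k = 12, and the toy reads are all assumptions made for illustration, and the heuristic does not handle randomized adapters.

```python
from collections import defaultdict


def guess_adapter_kmer(reads, k=12):
    """Naive guess at the start of a shared 3' adapter (illustrative only).

    Picks the k-mer found in the most reads; ties are broken by the smallest
    summed first position, then alphabetically, so the result is deterministic.
    """
    # kmer -> [number of reads containing it, sum of its first positions]
    stats = defaultdict(lambda: [0, 0])
    for read in reads:
        first_pos = {}
        for i in range(len(read) - k + 1):
            # Record only the first occurrence of each k-mer in this read.
            first_pos.setdefault(read[i:i + k], i)
        for kmer, pos in first_pos.items():
            stats[kmer][0] += 1
            stats[kmer][1] += pos
    if not stats:
        return None
    return min(stats, key=lambda km: (-stats[km][0], stats[km][1], km))


def trim_adapter(read, adapter_prefix):
    """Cut the read at the first occurrence of adapter_prefix, if present."""
    idx = read.find(adapter_prefix)
    return read if idx == -1 else read[:idx]


# Toy data (hypothetical): three short inserts followed by the same adapter.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"
INSERTS = [
    "TGAGGTAGTAGGTTGTATAGTT",
    "TAGCTTATCAGACTGATGTTGA",
    "TTCACAGTGGCTAAGTTCTGC",
]
READS = [insert + ADAPTER for insert in INSERTS]

guessed = guess_adapter_kmer(READS, k=12)
trimmed = [trim_adapter(read, guessed) for read in READS]
```

On real data a highly abundant small RNA can outnumber the adapter k-mers, which is one reason a simple frequency count like this is fragile and a dedicated tool is preferable.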

SUBMITTER: Chen HC 

PROVIDER: S-EPMC10833567 | biostudies-literature | 2024 Jan

REPOSITORIES: biostudies-literature


Publications

FindAdapt: A python package for fast and accurate adapter detection in small RNA sequencing.

Chen Hua-Chang, Wang Jing, Shyr Yu, Liu Qi

PLoS Computational Biology, 2024-01-22, issue 1



Similar Datasets

| S-EPMC4074385 | biostudies-literature
| S-EPMC11471259 | biostudies-literature
| S-EPMC6958438 | biostudies-literature
| S-EPMC8168212 | biostudies-literature
| S-EPMC5063419 | biostudies-literature
| S-EPMC6454532 | biostudies-literature
| S-EPMC9883683 | biostudies-literature
| S-EPMC6101086 | biostudies-literature
| S-EPMC7214040 | biostudies-literature
| S-EPMC10038132 | biostudies-literature