
Dataset Information


Evaluating the performance of low-frequency variant calling tools for the detection of variants from short-read deep sequencing data.


ABSTRACT: Detecting low-frequency variants with high accuracy plays an important role in biomedical research and clinical practice. However, this is challenging with next-generation sequencing (NGS) approaches because of the high error rates of NGS. To distinguish true low-level variants from these errors, many statistical variant-calling tools for low-frequency variants have been proposed, but a systematic performance comparison of these tools has not yet been performed. Here, we evaluated four raw-reads-based variant callers (SiNVICT, outLyzer, Pisces, and LoFreq) and four UMI-based variant callers (DeepSNVMiner, MAGERI, smCounter2, and UMI-VarCal) on their ability to call single nucleotide variants (SNVs) with allele frequencies as low as 0.025% in deep sequencing data. We analyzed 54 simulated datasets with various sequencing depths and variant allele frequencies (VAFs), two reference datasets, and Horizon Tru-Q sample data. The results showed that the UMI-based callers, except smCounter2, outperformed the raw-reads-based callers in detection limit. Sequencing depth had almost no effect on the UMI-based callers but significantly influenced the raw-reads-based callers. Regardless of sequencing depth, MAGERI was the fastest, while smCounter2 consistently took the longest to finish the variant calling process. Overall, DeepSNVMiner and UMI-VarCal performed best, with sensitivity and precision of 88% and 100% for DeepSNVMiner and 84% and 100% for UMI-VarCal. In conclusion, the UMI-based callers, except smCounter2, outperformed the raw-reads-based callers in sensitivity and precision. We recommend DeepSNVMiner and UMI-VarCal for low-frequency variant detection. The results provide important information on future directions for reliable low-frequency variant detection and algorithm development, which is critical in genetics-based medical research and clinical applications.
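The abstract reports sensitivity and precision and refers to VAFs as low as 0.025%. As a minimal sketch of how those quantities are conventionally defined (sensitivity = TP / (TP + FN), precision = TP / (TP + FP)), the Python below uses hypothetical function names and made-up counts. The counts are not taken from the study; they are chosen only so that the arithmetic reproduces an 88% / 100% pair.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true variants that the caller reported: TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("no true variants to evaluate")
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Fraction of reported variants that are true: TP / (TP + FP)."""
    if tp + fp == 0:
        raise ValueError("caller reported no variants")
    return tp / (tp + fp)


def expected_alt_reads(depth: int, vaf: float) -> float:
    """Expected number of reads carrying the variant at a given depth and VAF."""
    return depth * vaf


# Hypothetical counts (assumption, not data from the paper):
# 22 true variants called, 3 missed, 0 false calls.
tp, fn, fp = 22, 3, 0
print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"precision   = {precision(tp, fp):.2f}")

# At VAF 0.025% (0.00025), a hypothetical depth of 10,000x yields only a
# handful of expected variant-supporting reads.
print(f"expected alt reads = {expected_alt_reads(10_000, 0.00025):.2f}")
```

The last line illustrates why detection at very low VAFs is sensitive to sequencing errors: at such frequencies the expected number of supporting reads is small relative to typical per-base error counts.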

SUBMITTER: Xiang X 

PROVIDER: S-EPMC10665316 | biostudies-literature | 2023 Nov

REPOSITORIES: biostudies-literature


Publications

Evaluating the performance of low-frequency variant calling tools for the detection of variants from short-read deep sequencing data.

Xiang Xudong, Lu Bowen, Song Dongyang, Li Jie, Shu Kunxian, Pu Dan

Scientific Reports, 2023-11-22, issue 1



Similar Datasets

| S-EPMC4545535 | biostudies-literature
| S-EPMC6234735 | biostudies-literature
| S-EPMC11906795 | biostudies-literature
| S-EPMC9161029 | biostudies-literature
| S-EPMC10777354 | biostudies-literature
| S-EPMC5324109 | biostudies-literature
| S-EPMC8141913 | biostudies-literature
| S-EPMC10150536 | biostudies-literature
| S-EPMC8065719 | biostudies-literature
| S-EPMC6642177 | biostudies-literature