Genomics

Dataset Information

SeqControl: Process Control for DNA Sequencing


ABSTRACT: As next-generation sequencing (NGS) continues to increase in speed and throughput, routine clinical and industrial application draws steadily closer. These “production” uses of NGS will require enhanced quality monitoring and quality control to optimize output and reduce costs. We therefore developed a framework called SeqControl for predicting sequencing quality and coverage using a set of 15 metrics describing overall coverage, coverage distribution, base-wise coverage and base-wise quality. Using whole-genome sequences of 27 prostate cancers and 26 normal references, we derive multivariate models that predict sequencing quality and depth. SeqControl robustly predicts how much sequencing is required to reach a given coverage depth (AUC = 0.993), accurately classifies clinically relevant formalin-fixed paraffin-embedded samples, and makes predictions from as little as 1/8 of a lane of sequencing data (AUC = 0.967). These techniques can be immediately incorporated into existing NGS pipelines to monitor data quality in real time. SeqControl represents a first step towards statistical process control for NGS.
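
The abstract reports classifier performance as AUC values. As a reading aid, the sketch below shows one common way such a figure can be computed: the fraction of positive/negative sample pairs in which the positive sample receives the higher score. It is not the SeqControl implementation. The function name `roc_auc` and the toy scores and labels are hypothetical and synthetic, chosen only to illustrate the calculation.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample
    (label 1) scores higher than a randomly chosen negative sample
    (label 0); tied scores count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Synthetic, illustrative values only (not taken from the dataset).
# label 1 = sample reached the target coverage, 0 = it did not (assumed encoding).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
print(f"toy AUC = {toy_auc:.3f}")
```

The pairwise form is quadratic in the number of samples, which is acceptable for cohorts of a few dozen genomes; a rank-based formulation would be preferable for much larger sets.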

PROVIDER: EGAS00001000899 | EGA

REPOSITORIES: EGA

Similar Datasets

| 2219385 | ecrin-mdr-crc
2016-01-07 | E-GEOD-76618 | biostudies-arrayexpress
2013-04-02 | E-GEOD-33294 | biostudies-arrayexpress
2013-04-02 | GSE33294 | GEO
2018-08-31 | GSE117010 | GEO
2022-03-31 | PXD026806 | Pride
2018-04-18 | PXD003101 | Pride
2014-10-13 | E-GEOD-49636 | biostudies-arrayexpress
2020-10-22 | GSE144296 | GEO
| phs000950 | dbGaP