
Dataset Information


Novel bioinformatics quality control metric for next-generation sequencing experiments in the clinical context.


ABSTRACT: As the use of next-generation sequencing (NGS) for the diagnosis of Mendelian diseases expands, the performance of the method must improve to deliver higher-quality results. Performance measures are typically designed in the context of each application and therefore account for a spectrum of clinically relevant variants. We present EphaGen, a new computational methodology for bioinformatics quality control (QC). Given a single NGS dataset in BAM format and a pre-compiled VCF file of targeted clinically relevant variants, it assigns the dataset a single arbiter parameter. In essence, EphaGen estimates the probability of missing any variant from the defined spectrum within a particular NGS dataset. This performance measure closely resembles the diagnostic sensitivity of the given NGS dataset. Here we present case studies of the use of EphaGen in the context of BRCA1/2 and CFTR sequencing in a series of 14 runs across 43 blood samples and 504 publicly available NGS datasets. EphaGen is superior to conventional bioinformatics metrics such as coverage depth and coverage uniformity. We recommend using this software as a QC step in NGS studies in the clinical context. Availability: https://github.com/m4merg/EphaGen or https://hub.docker.com/r/m4merg/ephagen.

SUBMITTER: Ivanov M 

PROVIDER: S-EPMC6868350 | biostudies-literature | 2019 Dec

REPOSITORIES: biostudies-literature


Publications

Novel bioinformatics quality control metric for next-generation sequencing experiments in the clinical context.

Ivanov Maxim, Ivanov Mikhail, Kasianov Artem, Rozhavskaya Ekaterina, Musienko Sergey, Baranova Ancha, Mileyko Vladislav

Nucleic Acids Research, 2019 Dec 1, issue 21



Similar Datasets

| S-EPMC5506542 | biostudies-other
| S-EPMC4018527 | biostudies-literature
| S-EPMC8408346 | biostudies-literature
| S-EPMC7019349 | biostudies-literature
| S-EPMC7934511 | biostudies-literature
| S-EPMC8332578 | biostudies-literature
| S-EPMC7031678 | biostudies-literature