ABSTRACT:
Motivation
As sequencing technologies and analysis pipelines evolve, de novo mutation (DNM) calling tools must be adapted. Therefore, a flexible approach is needed that can accurately identify DNMs from genome or exome sequences from a variety of datasets and variant calling pipelines.
Results
Here, we describe SynthDNM, a random-forest-based classifier that can be readily adapted to new sequencing or variant-calling pipelines by applying a flexible approach to constructing simulated training examples from real data. The optimized SynthDNM classifiers predict de novo SNPs and indels with robust accuracy across multiple methods of variant calling.
Availability and implementation
SynthDNM is freely available on GitHub (https://github.com/james-guevara/synthdnm).
Supplementary information
Supplementary data are available at Bioinformatics online.
SUBMITTER: Lian A
PROVIDER: S-EPMC8545295 | biostudies-literature | 2021 Oct
REPOSITORIES: biostudies-literature
Lian Aojie A, Guevara James J, Xia Kun K, Sebat Jonathan J
Bioinformatics (Oxford, England) 20211001 20