
Dataset Information


Genome-wide scans for selective sweeps using convolutional neural networks.


ABSTRACT:

Motivation

Recent methods for selective sweep detection cast the problem as a classification task and use summary statistics as features to capture region characteristics that are indicative of a selective sweep, which makes them sensitive to confounding factors. Furthermore, they are not designed to perform whole-genome scans or to estimate the extent of the genomic region that was affected by positive selection; both are required for identifying candidate genes and for estimating the time and strength of selection.

Results

We present ASDEC (https://github.com/pephco/ASDEC), a neural-network-based framework that can scan whole genomes for selective sweeps. ASDEC achieves classification performance similar to that of other convolutional neural network-based classifiers that rely on summary statistics, but because it infers region characteristics directly from the raw sequence data, it is trained 10× faster and classifies genomic regions 5× faster. In genomic scans, ASDEC achieved up to 15.2× higher sensitivity, 19.4× higher success rates, and 4× higher detection accuracy than state-of-the-art methods. We used ASDEC to scan human chromosome 1 of the Yoruba population (1000 Genomes Project), identifying nine known candidate genes.
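
To make the idea of a window-based genome scan concrete, here is a minimal Python sketch. It is not ASDEC's implementation: the function names (`toy_score`, `scan`, `sweep_extent`), the SNP-count window parameters, and the threshold are all hypothetical, and `toy_score` is a simple diversity-based stand-in for a trained convolutional classifier. It only illustrates scoring overlapping windows along a chromosome and reading off the extent of the highest-scoring region.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Window = Tuple[int, int, float]  # (start position, end position, score)


def toy_score(snp_columns: Sequence[Sequence[int]]) -> float:
    """Hypothetical stand-in for a trained classifier.

    Returns 1 - mean expected heterozygosity (2p(1-p)) over the window,
    so windows with little variation score close to 1.
    """
    if not snp_columns:
        return 0.0
    total = 0.0
    for column in snp_columns:
        p = sum(column) / len(column)  # derived-allele frequency at this SNP
        total += 2.0 * p * (1.0 - p)
    return 1.0 - total / len(snp_columns)


def scan(positions: Sequence[int],
         snps: Sequence[Sequence[int]],
         window_snps: int,
         step_snps: int,
         score_fn: Callable[[Sequence[Sequence[int]]], float] = toy_score
         ) -> List[Window]:
    """Slide a fixed-size window (in SNPs) along the data and score each one."""
    windows: List[Window] = []
    for i in range(0, len(positions) - window_snps + 1, step_snps):
        block = snps[i:i + window_snps]
        windows.append((positions[i], positions[i + window_snps - 1],
                        score_fn(block)))
    return windows


def sweep_extent(windows: Sequence[Window],
                 threshold: float) -> Optional[Tuple[int, int]]:
    """Extent of the contiguous run of windows >= threshold around the peak."""
    if not windows:
        return None
    peak = max(range(len(windows)), key=lambda k: windows[k][2])
    if windows[peak][2] < threshold:
        return None
    lo = hi = peak
    while lo > 0 and windows[lo - 1][2] >= threshold:
        lo -= 1
    while hi < len(windows) - 1 and windows[hi + 1][2] >= threshold:
        hi += 1
    return windows[lo][0], windows[hi][1]


# Toy data (made up): 8 SNPs across 4 haplotypes; SNPs 3-6 carry no variation.
positions = [100, 200, 300, 400, 500, 600, 700, 800]
variable, fixed = [0, 1, 0, 1], [0, 0, 0, 0]
snps = [variable, variable, fixed, fixed, fixed, fixed, variable, variable]

windows = scan(positions, snps, window_snps=2, step_snps=1)
extent = sweep_extent(windows, threshold=0.9)  # (300, 600) for this toy input
```

In a real scan, `score_fn` would be replaced by the trained model's sweep probability for the window, and the window size, step, and threshold would need to be chosen for the dataset at hand.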

SUBMITTER: Zhao H 

PROVIDER: S-EPMC10311404 | biostudies-literature | 2023 Jun

REPOSITORIES: biostudies-literature


Publications

Genome-wide scans for selective sweeps using convolutional neural networks.

Hanqing Zhao, Matthijs Souilljee, Pavlos Pavlidis, Nikolaos Alachiotis

Bioinformatics (Oxford, England), 2023-06-01, vol. 39, Suppl 1



Similar Datasets

| S-EPMC8328518 | biostudies-literature
| S-EPMC5943620 | biostudies-other
| S-EPMC10579320 | biostudies-literature
| S-EPMC10001035 | biostudies-literature
| S-EPMC5870713 | biostudies-literature
| S-EPMC7924482 | biostudies-literature
| S-EPMC11436835 | biostudies-literature
| S-EPMC4918448 | biostudies-literature
| S-EPMC10167470 | biostudies-literature
| S-EPMC6618066 | biostudies-literature