Dataset Information

Identifying centromeric satellites with dna-brnn.


ABSTRACT: SUMMARY: Human alpha satellite and satellite 2/3 account for several percent of the human genome. However, identifying these sequences with traditional algorithms is computationally intensive. Here we develop dna-brnn, a recurrent neural network that learns the sequences of the two classes of centromeric repeats. It achieves high similarity to RepeatMasker and is times faster. Dna-brnn explores a novel application of deep learning and may accelerate the study of the evolution of the two repeat classes. AVAILABILITY AND IMPLEMENTATION: https://github.com/lh3/dna-nn.

SUBMITTER: Li H 

PROVIDER: S-EPMC6821349 | BioStudies | 2019-01-01

REPOSITORIES: biostudies

Similar Datasets

2016-01-01 | S-EPMC4756297 | BioStudies
2001-01-01 | S-EPMC311043 | BioStudies
2008-01-01 | S-EPMC2672092 | BioStudies
2018-01-01 | S-EPMC5844345 | BioStudies
2013-01-01 | S-EPMC3553004 | BioStudies
1000-01-01 | S-EPMC4079201 | BioStudies
2018-01-08 | GSE105100 | GEO
2015-01-01 | S-EPMC4585704 | BioStudies
2018-01-01 | S-EPMC6140926 | BioStudies
2005-01-01 | S-EPMC1187982 | BioStudies