
Dataset Information


INSIDER: alignment-free detection of foreign DNA sequences.


ABSTRACT: External DNA sequences can be inserted into an organism's genome either through natural processes such as gene transfer, or through targeted genome engineering strategies. Being able to robustly identify such foreign DNA is a crucial capability for health and biosecurity applications, such as anti-microbial resistance (AMR) detection or monitoring gene drives. This capability does not exist for poorly characterised host genomes or with limited information about the integrated sequence. To address this, we developed the INserted Sequence Information DEtectoR (INSIDER). INSIDER analyses whole genome sequencing data and identifies segments of potentially foreign origin by their significant shift in k-mer signatures. We demonstrate the power of INSIDER to separate integrated DNA sequences from normal genomic sequences on a synthetic dataset simulating the insertion of a CRISPR-Cas gene drive into wild-type yeast. As a proof-of-concept, we use INSIDER to detect the exact AMR plasmid in whole genome sequencing data from a Citrobacter freundii patient isolate. INSIDER streamlines the process of identifying integrated DNA in poorly characterised wild species or when the insert is of unknown origin, thus enhancing the monitoring of emerging biosecurity threats.
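The abstract describes flagging genome segments whose k-mer signature shifts significantly from the rest of the genome. The sketch below illustrates that general idea only; it is not the INSIDER implementation. The function names (`kmer_profile`, `flag_windows`), the Euclidean distance, the sliding-window sizes, and the z-score cutoff are all assumptions chosen for illustration and are not taken from the publication.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(seq, k=3):
    """Normalised k-mer frequency vector over all 4**k canonical-alphabet k-mers.

    Hypothetical helper: k-mers containing characters other than A, C, G, T
    are ignored.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flag_windows(genome, k=3, window=200, step=100, z_cutoff=3.0):
    """Return (start, end) windows whose k-mer profile is unusually far
    from the genome-wide profile.

    Assumption: "unusually far" means a z-score above z_cutoff among all
    window-to-background distances. This is an illustrative criterion only.
    """
    background = kmer_profile(genome, k)
    starts = list(range(0, len(genome) - window + 1, step))
    dists = [
        euclidean(kmer_profile(genome[s:s + window], k), background)
        for s in starts
    ]
    if not dists:
        return []
    mean = sum(dists) / len(dists)
    sd = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
    if sd == 0:
        return []
    return [
        (s, s + window)
        for s, d in zip(starts, dists)
        if (d - mean) / sd > z_cutoff
    ]
```

Calling `flag_windows(genome_string)` on an assembled sequence returns candidate coordinate ranges for closer inspection. A real tool working from whole genome sequencing reads would need to handle sequencing error, coverage, and multiple-testing effects, none of which this sketch attempts.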

SUBMITTER: Tay AP 

PROVIDER: S-EPMC8273350 | biostudies-literature | 2021

REPOSITORIES: biostudies-literature


Publications

INSIDER: alignment-free detection of foreign DNA sequences.

Tay AP, Hosking B, Hosking C, Bauer DC, Wilson LOW

Computational and Structural Biotechnology Journal, 2021-06-29



Similar Datasets

| S-EPMC11272466 | biostudies-literature
| S-EPMC4230918 | biostudies-literature
| S-EPMC10483029 | biostudies-literature
| S-EPMC4434998 | biostudies-literature
| S-EPMC9645238 | biostudies-literature
| S-EPMC9858667 | biostudies-literature
| S-EPMC9171953 | biostudies-literature
| S-EPMC6391537 | biostudies-literature
| S-EPMC6245493 | biostudies-literature
| S-EPMC7693541 | biostudies-literature