Dataset Information

Jasmine and Iris: population-scale structural variant comparison and analysis.


ABSTRACT: The availability of long reads is revolutionizing studies of structural variants (SVs). However, because SVs vary across individuals and are discovered through imprecise read technologies and methods, they can be difficult to compare. Addressing this, we present Jasmine and Iris (https://github.com/mkirsche/Jasmine/), for fast and accurate SV refinement, comparison and population analysis. Using an SV proximity graph, Jasmine outperforms six widely used comparison methods, including reducing the rate of Mendelian discordance in trio datasets by more than fivefold, and reveals a set of high-confidence de novo SVs confirmed by multiple technologies. We also present a unified callset of 122,813 SVs and 82,379 indels from 31 samples of diverse ancestry sequenced with long reads. We genotype these variants in 1,317 samples from the 1000 Genomes Project and the Genotype-Tissue Expression project with DNA and RNA-sequencing data and assess their widespread impact on gene expression, including within medically relevant genes.

SUBMITTER: Kirsche M 

PROVIDER: S-EPMC10006329 | biostudies-literature | 2023 Mar

REPOSITORIES: biostudies-literature

Publications

Jasmine and Iris: population-scale structural variant comparison and analysis.

Melanie Kirsche, Gautam Prabhu, Rachel Sherman, Bohan Ni, Alexis Battle, Sergey Aganezov, Michael C. Schatz

Nature Methods, 2023-01-19, issue 3


Similar Datasets

S-EPMC5345777 | biostudies-literature
S-EPMC6853660 | biostudies-literature
S-EPMC8779377 | biostudies-literature
S-EPMC7751401 | biostudies-literature
S-EPMC9793516 | biostudies-literature
S-EPMC10572636 | biostudies-literature
PRJEB28042 | ENA
S-EPMC1857026 | biostudies-literature
S-EPMC4448687 | biostudies-literature
S-EPMC10976732 | biostudies-literature