Dataset Information

Calling Structural Variants with Confidence from Short-Read Data in Wild Bird Populations.


ABSTRACT: Comprehensive characterization of structural variation in natural populations has only become feasible in the last decade. To investigate the population genomic nature of structural variation, reproducible and high-confidence structural variation callsets are first required. We created a population-scale reference of the genome-wide landscape of structural variation across 33 Nordic house sparrows (Passer domesticus). To produce a consensus callset across all samples using short-read data, we compare heuristic-based quality filtering and visual curation (Samplot/PlotCritic and Samplot-ML) approaches. We demonstrate that curation of structural variants is important for reducing putative false positives and that the time invested in this step outweighs the potential costs of analyzing short-read-discovered structural variation data sets that include many potential false positives. We find that even a lenient manual curation strategy (e.g. applied by a single curator) can reduce the proportion of putative false positives by up to 80%, thus enriching the proportion of high-confidence variants. Crucially, in applying a lenient manual curation strategy with a single curator, nearly all (>99%) variants rejected as putative false positives were also classified as such by a more stringent curation strategy using three additional curators. Furthermore, variants rejected by manual curation failed to reflect the expected population structure from SNPs, whereas variants passing curation did. Combining heuristic-based quality filtering with rapid manual curation of structural variants in short-read data can therefore become a time- and cost-effective first step for functional and population genomic studies requiring high-confidence structural variation callsets.

SUBMITTER: David G 

PROVIDER: S-EPMC11018544 | biostudies-literature | 2024 Apr

REPOSITORIES: biostudies-literature

Publications

Calling Structural Variants with Confidence from Short-Read Data in Wild Bird Populations.

David G, Bertolotti A, Layer R, Scofield D, Hayward A, Baril T, Burnett HA, Gudmunds E, Jensen H, Husby A

Genome Biology and Evolution, 2024-04-01 (4)


Similar Datasets

| S-EPMC11373317 | biostudies-literature
| S-EPMC10241795 | biostudies-literature
| S-EPMC8138798 | biostudies-literature
| S-EPMC4545535 | biostudies-literature
| S-EPMC10777354 | biostudies-literature
| S-EPMC6642177 | biostudies-literature
| S-EPMC9900919 | biostudies-literature
| S-EPMC11616694 | biostudies-literature
| S-EPMC10665316 | biostudies-literature
| S-EPMC10946394 | biostudies-literature