ABSTRACT:
Motivation
The increasing availability of complete genomes demands models to study genomic variability within entire populations. Pangenome graphs capture the full genomic similarity and diversity between multiple genomes. In order to understand them, we need to see them. For visualization, we need a human-readable graph layout: a graph embedding in a low-dimensional (e.g. two-dimensional) depiction. Due to a pangenome graph's potentially excessive size, this is a significant challenge.
Results
In response, we introduce a novel graph layout algorithm: Path-Guided Stochastic Gradient Descent (PG-SGD). PG-SGD uses the genomes, represented in the pangenome graph as paths, as an embedded positional system to sample genomic distances between pairs of nodes. This avoids the quadratic cost seen in previous versions of graph drawing by Stochastic Gradient Descent (SGD). We show that our implementation efficiently computes low-dimensional layouts of gigabase-scale pangenome graphs, unveiling their biological features.
Availability
We integrated PG-SGD into ODGI, which is released as free software under the MIT open-source license. Source code is available at https://github.com/pangenome/odgi.
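The path-guided sampling described under Results can be illustrated with a minimal Python sketch. This is not the ODGI implementation: the function names (`path_positions`, `pg_sgd_layout`), the parameters, the uniform pair sampling, the single 2D point per node, and the exponentially decaying learning rate (borrowed from the general style of SGD graph drawing) are all assumptions made for illustration. The point it shows is that the target distance for a sampled pair is read directly from positions along a path, so no all-pairs distance computation is needed.

```python
import math
import random


def path_positions(path, node_len):
    """Return the start offset (in bp) of each step along one path."""
    offsets, pos = [], 0
    for node in path:
        offsets.append(pos)
        pos += node_len[node]
    return offsets


def pg_sgd_layout(paths, node_len, iterations=30, eps=0.1, seed=0):
    """Toy path-guided SGD layout (hypothetical helper, not ODGI's API).

    paths    : list of paths, each a list of node ids
    node_len : dict mapping node id -> sequence length (assumed > 0)
    Returns a dict mapping node id -> [x, y].
    """
    rng = random.Random(seed)
    offsets = [path_positions(p, node_len) for p in paths]
    # One random 2D point per node (a simplification made for this sketch).
    coords = {n: [rng.random(), rng.random()] for p in paths for n in p}
    # Only paths with at least two steps can supply a pair of nodes.
    usable = [k for k, p in enumerate(paths) if len(p) > 1]
    if not usable:
        return coords

    # Learning-rate schedule: decay exponentially from eta_max to eta_min.
    d_max = max(offsets[k][-1] for k in usable)
    d_min = min(node_len[n] for n in coords)
    eta_max, eta_min = d_max ** 2, eps * d_min ** 2
    lam = math.log(eta_max / eta_min) / max(iterations - 1, 1)

    # Work per iteration is linear in the total number of path steps.
    updates_per_iter = 10 * sum(len(paths[k]) for k in usable)

    for t in range(iterations):
        eta = eta_max * math.exp(-lam * t)
        for _ in range(updates_per_iter):
            k = rng.choice(usable)
            i, j = rng.sample(range(len(paths[k])), 2)
            a, b = paths[k][i], paths[k][j]
            if a == b:
                continue  # the path visits the same node twice
            # Target distance: difference of positions along the path.
            d = abs(offsets[k][i] - offsets[k][j])
            mu = min(eta / (d * d), 1.0)
            dx = coords[a][0] - coords[b][0]
            dy = coords[a][1] - coords[b][1]
            mag = math.hypot(dx, dy) or 1e-9
            # Move both points along their connecting line so that their
            # distance shifts toward d by a fraction mu of the error.
            r = mu * (mag - d) / (2.0 * mag)
            coords[a][0] -= r * dx
            coords[a][1] -= r * dy
            coords[b][0] += r * dx
            coords[b][1] += r * dy
    return coords
```

For example, `pg_sgd_layout([[0, 1, 2, 3, 4]], {n: 10 for n in range(5)})` lays out a single linear path of five 10 bp nodes; adding further paths that share nodes pulls those nodes toward positions consistent with every path that visits them.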
SUBMITTER: Heumos S
PROVIDER: S-EPMC10542513 | biostudies-literature | 2023 Oct
REPOSITORIES: biostudies-literature
Simon Heumos, Andrea Guarracino, Jan-Niklas M. Schmelzle, Jiajie Li, Zhiru Zhang, Jörg Hagmann, Sven Nahnsen, Pjotr Prins, Erik Garrison
bioRxiv: the preprint server for biology, 2023-10-17