
Dataset Information


G-SAIP: Graphical Sequence Alignment Through Parallel Programming in the Post-Genomic Era.


ABSTRACT: A common task in bioinformatics is to compare DNA sequences to identify similarities between organisms at the sequence level. One approach to such comparison is the dot-plot, a 2-dimensional graphical representation used to analyze DNA or protein alignments. Dot-plot alignment software existed before the sequencing revolution, and it is now limited when dealing with large sequences, resulting in very long execution times. High-Performance Computing (HPC) techniques have been successfully used in many applications to reduce computing times, but so far very few applications for graphical sequence alignment using HPC have been reported. Here, we present G-SAIP (Graphical Sequence Alignment in Parallel), software capable of spawning multiple distributed processes on CPUs over a supercomputing infrastructure. It speeds up dot-plot generation by up to 1.68× compared with the fastest current tools, and it improves the efficiency of tasks that benefit from pairwise genome comparison: comparative structural genomic analysis, phylogenetics, repetitive structure identification, and assembly quality checking.
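
The dot-plot idea described in the abstract can be illustrated with a minimal Python sketch. This is not G-SAIP's implementation: the function names (`match_rows`, `split_rows`, `dot_plot`), the exact k-mer matching rule, and the row-wise split of the work are all assumptions made for illustration. The sketch marks a dot at (i, j) wherever a word of length `k` starting at position i of the first sequence equals the word starting at position j of the second, and it divides the rows of the plot into independent chunks that separate processes could compute.

```python
from concurrent.futures import ProcessPoolExecutor


def match_rows(seq_a, seq_b, k, start, stop):
    """Return (i, j) pairs where seq_a[i:i+k] == seq_b[j:j+k], for start <= i < stop."""
    # Index every k-mer of seq_b by the positions where it starts.
    index = {}
    for j in range(len(seq_b) - k + 1):
        index.setdefault(seq_b[j:j + k], []).append(j)
    hits = []
    # Look up each k-mer of seq_a that falls inside this chunk of rows.
    for i in range(start, min(stop, len(seq_a) - k + 1)):
        for j in index.get(seq_a[i:i + k], []):
            hits.append((i, j))
    return hits


def split_rows(n_rows, n_chunks):
    """Split range(n_rows) into up to n_chunks contiguous (start, stop) ranges."""
    size = max(1, -(-n_rows // n_chunks))  # ceiling division
    return [(s, min(s + size, n_rows)) for s in range(0, n_rows, size)]


def dot_plot(seq_a, seq_b, k=3, workers=1):
    """Compute all dots of the plot, optionally spreading row chunks over processes."""
    n_rows = max(0, len(seq_a) - k + 1)
    chunks = split_rows(n_rows, workers)
    if workers == 1:
        # Serial path: same chunking, no process pool.
        parts = [match_rows(seq_a, seq_b, k, s, e) for s, e in chunks]
    else:
        # Each chunk of rows is independent, so it can run in its own process.
        with ProcessPoolExecutor(max_workers=workers) as pool:
            futures = [pool.submit(match_rows, seq_a, seq_b, k, s, e)
                       for s, e in chunks]
            parts = [f.result() for f in futures]
    return sorted(hit for part in parts for hit in part)
```

Calling `dot_plot("ACGTACGT", "ACGT", k=4)` returns the dot coordinates as a sorted list of `(i, j)` tuples. With `workers` greater than 1, the call should be placed under an `if __name__ == "__main__":` guard on platforms that start worker processes by spawning. A single-machine process pool only hints at the distributed setting the abstract describes; the row-wise split is one plausible way to divide the work, chosen here because chunks need no communication until the final merge.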

SUBMITTER: Pina JS 

PROVIDER: S-EPMC9871978 | biostudies-literature | 2023

REPOSITORIES: biostudies-literature


Publications

G-SAIP: Graphical Sequence Alignment Through Parallel Programming in the Post-Genomic Era.

Piña JS, Orozco-Arias S, Tobón-Orozco N, Camargo-Forero L, Tabares-Soto R, Guyot R

Evolutionary Bioinformatics Online, 2023-01-20



Similar Datasets

| S-EPMC6761980 | biostudies-literature
| S-EPMC2668612 | biostudies-literature
| S-EPMC4253828 | biostudies-literature
| S-EPMC4221118 | biostudies-literature
| S-EPMC5299732 | biostudies-literature
| S-EPMC5611559 | biostudies-literature
| S-EPMC138846 | biostudies-literature
| S-EPMC5693744 | biostudies-literature
| S-EPMC2752613 | biostudies-literature
| S-EPMC4435022 | biostudies-other