Dataset Information

Highly efficient clustering of long-read transcriptomic data with GeLuster.


ABSTRACT:

Motivation

Advances in long-read RNA sequencing technologies promise a bright future for transcriptome analysis, in which clustering long reads according to their gene family of origin is of great importance. However, existing de novo clustering algorithms require substantial computing resources.

Results

We developed GeLuster, a new algorithm for clustering long RNA-seq reads. In our tests on one simulated dataset and nine real datasets, GeLuster exhibited superior performance. On the tested Nanopore datasets, it ran 2.9-17.5 times as fast as the second-fastest method with less than one-seventh of the memory consumption, while achieving higher clustering accuracy. GeLuster performed similarly on the PacBio data. It sets the stage for large-scale transcriptome studies in the future.

Availability and implementation

GeLuster is freely available at https://github.com/yutingsdu/GeLuster.

SUBMITTER: Ma J 

PROVIDER: S-EPMC10881092 | biostudies-literature | 2024 Feb

REPOSITORIES: biostudies-literature

Publications

Highly efficient clustering of long-read transcriptomic data with GeLuster.

Ma Junchi, Zhao Xiaoyu, Qi Enfeng, Han Renmin, Yu Ting, Li Guojun

Bioinformatics (Oxford, England), 2024 Feb 1, issue 2



Similar Datasets

| S-EPMC10868325 | biostudies-literature
| S-EPMC10777354 | biostudies-literature
| S-EPMC7673114 | biostudies-literature
| S-EPMC9900919 | biostudies-literature
| S-EPMC9985341 | biostudies-literature
| S-EPMC7100596 | biostudies-literature
| S-EPMC4253826 | biostudies-literature
| S-EPMC7931822 | biostudies-literature
| S-EPMC10354735 | biostudies-literature
| S-EPMC6009963 | biostudies-literature