Dataset Information

TVAE-RNA: ensemble-based RNA secondary structure prediction via transformer variational autoencoders.


ABSTRACT:

Motivation

Accurate prediction of RNA secondary structure remains challenging due to the presence of pseudoknots, long-range dependencies, and limited labeled data.

Results

We propose TVAE, a novel framework that integrates a Transformer encoder with a Variational Autoencoder (VAE). The Transformer captures global dependencies in the sequence, while the VAE models structural variability by learning a probabilistic latent space. Unlike deterministic models, TVAE generates diverse and biologically plausible secondary structures, enabling more comprehensive structure discovery. To obtain discrete predictions, we introduce GHA-Pairing, a fast and biologically constrained base-pairing algorithm. TVAE demonstrates strong generalization across different RNA families and achieves state-of-the-art performance on benchmark datasets, reaching an F1 score of 0.89 and 83% accuracy, surpassing existing methods by 10%. These results highlight the advantage of probabilistic modeling for RNA structure prediction and its potential to enhance biological insights.
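The abstract names GHA-Pairing but does not describe how it works. As a rough illustration of what turning a predicted pairing-probability matrix into a discrete, biologically constrained structure can look like, here is a minimal sketch of a generic greedy decoder. It is not the authors' GHA-Pairing algorithm. The function name `greedy_pairing`, the `threshold` and `min_loop` parameters, and the specific constraints (canonical and wobble pairs only, at most one partner per base, a minimum hairpin loop length) are assumptions chosen for illustration. Crossing pairs are deliberately not forbidden, so pseudoknotted structures remain possible.

```python
from typing import List, Sequence, Tuple

# Watson-Crick pairs plus the G-U wobble pair (assumed constraint set).
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def greedy_pairing(
    seq: str,
    probs: Sequence[Sequence[float]],
    threshold: float = 0.5,  # hypothetical cutoff on pairing probability
    min_loop: int = 3,       # hypothetical minimum number of unpaired bases in a hairpin
) -> List[Tuple[int, int]]:
    """Greedily select base pairs (i, j), i < j, from a probability matrix.

    Illustrative only: a generic constrained decoder, not GHA-Pairing.
    """
    n = len(seq)
    candidates = []
    for i in range(n):
        # Require at least `min_loop` bases between the two partners.
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) in CANONICAL and probs[i][j] >= threshold:
                candidates.append((probs[i][j], i, j))

    # Highest probability first; ties broken by position for determinism.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))

    paired = set()
    pairs = []
    for _, i, j in candidates:
        # Each base may take part in at most one pair.
        if i in paired or j in paired:
            continue
        paired.update((i, j))
        pairs.append((i, j))
    return sorted(pairs)


if __name__ == "__main__":
    seq = "GGGAAAUCCC"
    n = len(seq)
    probs = [[0.0] * n for _ in range(n)]
    probs[0][9] = 0.95
    probs[1][8] = 0.90
    probs[2][7] = 0.90
    probs[2][6] = 0.60  # competes with (2, 7) and loses
    print(greedy_pairing(seq, probs))  # [(0, 9), (1, 8), (2, 7)]
```

In an ensemble setting such as the one the abstract describes, a decoder of this kind would be applied once per sampled probability matrix, giving one discrete structure per latent sample.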

Availability and implementation

Code and pretrained models are available at https://github.com/mei-rna/TVAE-RNA. The released version of the dataset and models can also be accessed via DOI: 10.5281/zenodo.16946114.

SUBMITTER: Mei X 

PROVIDER: S-EPMC12640237 | biostudies-literature | 2025 Nov

REPOSITORIES: biostudies-literature

Publications

TVAE-RNA: ensemble-based RNA secondary structure prediction via transformer variational autoencoders.

Mei Xiyuan, Liu Hanbo, Zhu Yuheng, Zhao Enshuang, Li Longyi, Zhang Hao

Bioinformatics (Oxford, England), 2025-11-01, issue 11



Similar Datasets

| S-EPMC3750279 | biostudies-literature
| S-EPMC1370799 | biostudies-literature
| S-EPMC5836418 | biostudies-literature
| S-EPMC9710582 | biostudies-literature
| S-EPMC10280414 | biostudies-literature
| S-EPMC11966612 | biostudies-literature
| S-EPMC9882246 | biostudies-literature
| S-EPMC3819574 | biostudies-literature