FLYNC: A Machine Learning-Driven Framework for Discovering Long Non-Coding RNAs in Drosophila melanogaster
ABSTRACT: Non-coding RNAs have increasingly recognized roles in critical molecular mechanisms of disease. However, the non-coding genome of Drosophila melanogaster, one of the most powerful disease model organisms, has been understudied. Here, we present FLYNC – FLY Non-Coding RNA discovery and classification – a novel machine learning model that predicts the probability of a newly identified RNA transcript being a long non-coding RNA (lncRNA). Integrated into an end-to-end bioinformatics pipeline capable of processing single-cell or bulk RNA sequencing data, FLYNC outputs potential new non-coding RNA genes. FLYNC leverages large-scale genomic and transcriptomic datasets to identify patterns and features that distinguish non-coding genes from protein-coding genes, thereby facilitating lncRNA prediction. We demonstrate the application of FLYNC to publicly available bulk transcriptome data from Drosophila adult heads and single-cell transcriptomic data from Drosophila neural stem cell lineages, identifying several novel tissue- and cell-specific lncRNAs. We further experimentally validated a set of FLYNC-predicted lncRNAs by RT-PCR and RNA Pol II binding. Overall, our findings demonstrate that FLYNC serves as a robust tool for identifying lncRNAs in Drosophila melanogaster, transcending current limitations in ncRNA identification and harnessing the potential of machine learning.
ORGANISM(S): Drosophila melanogaster
PROVIDER: GSE282899 | GEO | 2026/01/21
REPOSITORIES: GEO