A machine learning framework to predict cancer metabolomics from gene-expression data
ABSTRACT: Metabolomics provides a direct functional readout of a tumor’s physiology, yet it lags behind other omics technologies in facilitating disease monitoring and prognostication. This stems partly from the scarcity of large-scale metabolomic studies and partly from the analytical complexity of detecting diverse metabolites with varying physicochemical properties and concentrations. To address this, we developed a machine learning framework that predicts metabolomic profiles from gene expression data, using both tumor tissue and cell line samples across multiple cancer types. Two different model types were selected and trained for tissues and cell lines, and their generalization capacity was validated on independent cohorts, where they accurately predicted up to 70-80% of tested metabolites. This work offers a scalable and efficient machine learning pipeline for inferring metabolic signatures from transcriptomic signatures, opening avenues to reconstruct and study the metabolic landscape of samples in novel and existing datasets that lack direct metabolomics measurements.
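The abstract reports the share of tested metabolites that were accurately predicted on independent cohorts, but this record does not state the accuracy criterion. The following is a minimal sketch of how such a share could be computed, assuming (hypothetically) that a metabolite counts as accurately predicted when the Pearson correlation between measured and predicted levels across held-out samples reaches a chosen threshold. The function names, the threshold of 0.3, and the toy values are illustrative assumptions, not the authors' method or data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def fraction_well_predicted(measured, predicted, threshold=0.3):
    """Share of metabolites whose predictions correlate with measurements.

    measured, predicted: dicts mapping metabolite name -> list of values,
    one value per sample in the held-out cohort (same sample order).
    threshold: assumed cutoff on Pearson r; not taken from the source.
    """
    tested = [m for m in measured if m in predicted]
    if not tested:
        return 0.0
    hits = sum(1 for m in tested if pearson(measured[m], predicted[m]) >= threshold)
    return hits / len(tested)


# Toy values for three metabolites across four held-out samples (made up).
measured = {
    "lactate": [1.0, 2.0, 3.0, 4.0],
    "glutamate": [2.0, 1.0, 4.0, 3.0],
    "citrate": [1.0, 2.0, 3.0, 4.0],
}
predicted = {
    "lactate": [1.1, 1.9, 3.2, 3.8],
    "glutamate": [2.2, 0.9, 3.7, 3.1],
    "citrate": [4.0, 3.0, 2.0, 1.0],  # anti-correlated, so not counted
}

print(fraction_well_predicted(measured, predicted))
```

In this toy case two of the three metabolites pass the assumed threshold. A rank-based correlation or an error-based criterion could be swapped in without changing the overall structure.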
ORGANISM(S): Homo sapiens
PROVIDER: GSE299429 | GEO | 2025/09/30
REPOSITORIES: GEO