Proteomics

Dataset Information

Federated deep learning enables cancer subtyping by proteomics


ABSTRACT: Artificial intelligence (AI) applications in biomedical settings face challenges such as data privacy and regulatory compliance. Federated Deep Learning (FDL) effectively addresses these issues. We developed ProCanFDL, where local models were trained on simulated sites using proteomic data drawn from a pan-cancer cohort (n = 1,260) and 29 other cohorts (n = 6,265), representing 4,956 patients and 19,930 mass spectrometry (MS) runs, all held behind private firewalls. Local parameter updates were aggregated to build the global model, achieving a 43% performance gain over local models on the hold-out test set (n = 625) in 14 cancer subtyping tasks. Additionally, ProCanFDL preserved data privacy while matching centralized model performance. External validation assessed generalization by retraining the global model with data from two external cohorts (n = 55) and eight further cohorts (n = 832) acquired using a different MS technology. ProCanFDL presents a solution for internationally collaborative machine learning initiatives using proteomic data while maintaining data privacy.
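
The aggregation step described in the abstract (local parameter updates combined into a global model) can be illustrated with a minimal sketch. This record does not state which aggregation rule ProCanFDL uses, so the sample-size-weighted average below (in the style of federated averaging) is an assumption, and the names `federated_average`, `local_params` and `n_samples` are hypothetical. Parameters are plain lists of floats to keep the example dependency-free.

```python
from typing import Dict, List

# One site's model parameters: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def federated_average(local_params: List[Params], n_samples: List[int]) -> Params:
    """Combine per-site parameters into global parameters.

    Each site's contribution is weighted by its share of the total
    number of training samples (an assumed, FedAvg-style rule).
    """
    if not local_params or len(local_params) != len(n_samples):
        raise ValueError("need one sample count per site")
    total = sum(n_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    global_params: Params = {}
    for name in local_params[0]:
        averaged = [0.0] * len(local_params[0][name])
        for params, n in zip(local_params, n_samples):
            weight = n / total  # this site's share of all samples
            for i, value in enumerate(params[name]):
                averaged[i] += weight * value
        global_params[name] = averaged
    return global_params


# Two simulated sites; only parameters leave each site, never raw data.
site_a = {"w": [1.0, 2.0]}
site_b = {"w": [3.0, 6.0]}
global_model = federated_average([site_a, site_b], n_samples=[1, 3])
```

In this sketch the site with three samples pulls the global parameters three times as strongly as the site with one, and no raw measurements are exchanged between sites.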

INSTRUMENT(S): TripleTOF 6600

ORGANISM(S): Homo sapiens (human)

TISSUE(S): Whole Body

SUBMITTER: Zainab Noor  

LAB HEAD: Peter G

PROVIDER: PXD056810 | Pride | 2025-05-29

REPOSITORIES: Pride

Similar Datasets

2024-01-23 | PXD037038 | Pride
2017-08-28 | PXD005710 | Pride
2024-01-12 | PXD044157 | Pride
2024-12-31 | E-MTAB-6775 | biostudies-arrayexpress
2020-07-15 | E-MTAB-6708 | biostudies-arrayexpress
2015-11-21 | E-GEOD-64257 | biostudies-arrayexpress
2007-09-06 | E-GEOD-3189 | biostudies-arrayexpress
2023-03-11 | PXD022411 | Pride
2025-05-06 | PXD035206 | Pride
2015-11-21 | E-GEOD-64258 | biostudies-arrayexpress