Dataset Information

Developing and Validating a Survival Prediction Model for NSCLC Patients Through Distributed Learning Across 3 Countries.


ABSTRACT:

Purpose

Tools for survival prediction for non-small cell lung cancer (NSCLC) patients treated with chemoradiation or radiation therapy are of limited quality. In this work, we developed a predictive model of survival at 2 years. The model is based on a large volume of historical patient data and serves as a proof of concept to demonstrate the distributed learning approach.

Methods and materials

Clinical data from 698 lung cancer patients, treated with curative intent with chemoradiation or radiation therapy alone, were collected and stored at 2 different cancer institutes (559 patients at Maastro clinic [Netherlands] and 139 at Michigan university [United States]). The model was further validated on 196 patients originating from The Christie (United Kingdom). A Bayesian network model was adapted for distributed learning (the animation can be viewed at https://www.youtube.com/watch?v=ZDJFOxpwqEA). Two-year posttreatment survival was chosen as the endpoint. The Maastro clinic cohort data are publicly available at https://www.cancerdata.org/publication/developing-and-validating-survival-prediction-model-nsclc-patients-through-distributed, and the developed models can be found at www.predictcancer.org.
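To make the idea of learning without pooling patient records concrete, here is a minimal sketch, not the authors' adapted algorithm. It assumes a discrete Bayesian network whose conditional probability tables are estimated by simple counting, so each site only needs to send aggregate counts. The variable names (`t_stage`, `alive_2y`), the toy records, and the helper functions are all hypothetical.

```python
from collections import Counter

def local_counts(records, child, parents):
    """Count (parent values, child value) combinations at one site.

    Only these aggregate counts would leave the site, never the records.
    """
    counts = Counter()
    for record in records:
        parent_values = tuple(record[p] for p in parents)
        counts[(parent_values, record[child])] += 1
    return counts

def cpt_from_counts(counts):
    """Normalise counts into P(child | parents) by maximum likelihood."""
    totals = Counter()
    for (parent_values, _), n in counts.items():
        totals[parent_values] += n
    return {key: n / totals[key[0]] for key, n in counts.items()}

# Hypothetical toy records held at two separate sites.
site_a = [
    {"t_stage": "T1", "alive_2y": 1},
    {"t_stage": "T1", "alive_2y": 0},
    {"t_stage": "T2", "alive_2y": 0},
]
site_b = [
    {"t_stage": "T1", "alive_2y": 1},
    {"t_stage": "T2", "alive_2y": 0},
    {"t_stage": "T2", "alive_2y": 1},
]

# Distributed: each site counts locally, a coordinator sums the counts.
summed = local_counts(site_a, "alive_2y", ["t_stage"]) + local_counts(
    site_b, "alive_2y", ["t_stage"]
)
distributed_cpt = cpt_from_counts(summed)

# Centralized: all records are pooled first (shown only for comparison).
central_cpt = cpt_from_counts(
    local_counts(site_a + site_b, "alive_2y", ["t_stage"])
)
```

In this toy setting the two tables coincide because counts are additive across sites. A real deployment would also need to handle missing values, structure learning, and small-count disclosure risks, none of which this sketch addresses.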

Results

Variables included in the final model were T and N category, age, performance status, and total tumor dose. The model has an area under the curve (AUC) of 0.66 on the external validation set and an AUC of 0.62 on 5-fold cross-validation. A model based on the T and N category performed with an AUC of 0.47 on the validation set, significantly worse than our model (P<.001). Learning the model in a centralized or distributed fashion yields a minor difference in the probabilities of the conditional probability tables (0.6%); the discriminative performance of the models on the validation set is similar (P=.26).
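For readers unfamiliar with the AUC figures quoted above, the following sketch computes AUC as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The function name `auc` and the example scores are illustrative assumptions and are not taken from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted probabilities of the positive class
    labels: 1 for positive (e.g. alive at 2 years), 0 for negative
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))

# Hypothetical predictions: one positive ranked highest, one ranked lowest.
example_auc = auc([0.9, 0.4, 0.6, 0.2], [1, 0, 0, 1])
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which is the scale on which the reported 0.66, 0.62, and 0.47 should be read.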

Conclusions

Distributed learning from federated databases allows learning of predictive models on data originating from multiple institutions while avoiding many of the data-sharing barriers. We believe that distributed learning is the future of sharing data in health care.

SUBMITTER: Jochems A 

PROVIDER: S-EPMC5575360 | biostudies-literature | 2017 Oct

REPOSITORIES: biostudies-literature

Publications

Developing and Validating a Survival Prediction Model for NSCLC Patients Through Distributed Learning Across 3 Countries.

Arthur Jochems, Timo M Deist, Issam El Naqa, Marc Kessler, Chuck Mayo, Jackson Reeves, Shruti Jolly, Martha Matuszak, Randall Ten Haken, Johan van Soest, Cary Oberije, Corinne Faivre-Finn, Gareth Price, Dirk de Ruysscher, Philippe Lambin, Andre Dekker

International Journal of Radiation Oncology, Biology, Physics. 2017-04-24; 2


Similar Datasets

| S-EPMC10518974 | biostudies-literature
| S-EPMC8085432 | biostudies-literature
| S-EPMC11491655 | biostudies-literature
| S-EPMC8031110 | biostudies-literature
| S-EPMC9906091 | biostudies-literature
| S-EPMC9343202 | biostudies-literature
| S-EPMC9856360 | biostudies-literature
| S-EPMC8767133 | biostudies-literature
| S-EPMC7987139 | biostudies-literature
| S-EPMC9718161 | biostudies-literature