Dataset Information


Derivation of molecular signatures for breast cancer recurrence prediction using a two-way validation approach.

ABSTRACT: Previous studies have demonstrated the potential value of gene expression signatures in assessing the risk of post-surgical breast cancer recurrence; however, many of these predictive models have been derived using simple computational algorithms and validated internally or using one-way validation on a single dataset. We have recently developed a new feature selection algorithm that overcomes some limitations inherent to high-dimensional data analysis. In this study, we applied this algorithm to two publicly available gene expression datasets obtained from over 400 patients with breast cancer to investigate whether we could derive more accurate prognostic signatures and reveal common predictive factors across independent datasets. We compared the performance of three advanced computational algorithms using a robust two-way validation method, where one dataset was used for training and to establish a prediction model that was then blindly tested on the other dataset. The experiment was then repeated in the reverse direction. Analyses identified prognostic signatures that, while comprising only 10-13 genes, significantly outperformed previously reported signatures for breast cancer evaluation. The cross-validation approach revealed CEGP1 and PRAME as major candidates for breast cancer biomarker development.
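
The two-way validation described in the abstract (train on one dataset, test blindly on the other, then repeat in the reverse direction) can be sketched as follows. This is only an illustrative sketch, not the study's method: the nearest-centroid classifier stands in for the three algorithms the abstract mentions without naming, the two cohorts are synthetic, and every name here (`fit_centroids`, `two_way_validation`, `make_cohort`) is hypothetical.

```python
import random

def fit_centroids(X, y):
    """Fit a nearest-centroid model: one mean vector per class label."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Return the label whose centroid is closest to sample x."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sqdist(centroids[label]))

def accuracy(centroids, X, y):
    """Fraction of samples in (X, y) that the model labels correctly."""
    hits = sum(predict(centroids, x) == lab for x, lab in zip(X, y))
    return hits / len(y)

def two_way_validation(A, B):
    """Train on A and test on B, then train on B and test on A."""
    (Xa, ya), (Xb, yb) = A, B
    return {
        "train_A_test_B": accuracy(fit_centroids(Xa, ya), Xb, yb),
        "train_B_test_A": accuracy(fit_centroids(Xb, yb), Xa, ya),
    }

def make_cohort(n, seed):
    """Synthetic stand-in for an expression dataset (assumption, not real data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # 0 = no recurrence, 1 = recurrence (illustrative labels)
        X.append([rng.gauss(3.0 * label, 1.0) for _ in range(5)])
        y.append(label)
    return X, y

cohort_a = make_cohort(100, seed=0)
cohort_b = make_cohort(100, seed=1)
scores = two_way_validation(cohort_a, cohort_b)
print(scores)
```

Reporting both directions separately, rather than averaging them, makes it visible when a model transfers well one way but not the other.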


PROVIDER: S-EPMC2844120 | BioStudies | 2010-01-01

REPOSITORIES: biostudies

Similar Datasets

2016-01-01 | S-EPMC5001207 | BioStudies
2014-01-01 | S-EPMC4052785 | BioStudies
2012-01-01 | S-EPMC3314564 | BioStudies
2008-01-01 | S-EPMC2527336 | BioStudies
2020-01-01 | S-EPMC7299274 | BioStudies
2016-01-01 | S-EPMC4801472 | BioStudies
2013-01-01 | S-EPMC3558433 | BioStudies
2020-01-01 | S-EPMC7531359 | BioStudies
2011-01-01 | S-EPMC3233554 | BioStudies
2019-01-01 | S-EPMC6425557 | BioStudies