Dataset Information

Stable Iterative Variable Selection.

ABSTRACT:

Motivation

The emergence of datasets with tens of thousands of features, such as high-throughput omics biomedical data, highlights the importance of reducing the feature space to a distilled subset that truly captures the signal, which helps research and industry find more effective biomarkers for the question at hand. A good feature set also facilitates building robust predictive models, with improved interpretability and better convergence of the applied method owing to the smaller feature space.

Results

Here, we present a robust feature selection method named Stable Iterative Variable Selection (SIVS) and assess its performance on both omics and clinical data types. To assess performance, we compared the number and goodness of the features selected by SIVS with those selected by Least Absolute Shrinkage and Selection Operator (LASSO) regression. The results suggested that the feature space selected by SIVS was, on average, 41% smaller, with no negative effect on model performance. Similar results were observed in comparisons with Boruta and caret RFE.
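To make the "on average, X% smaller" comparison concrete, the following minimal sketch computes the relative reduction in selected-feature count per dataset and then averages it. The feature counts and dataset names are hypothetical placeholders, not values from the paper, and averaging per-dataset reductions is an assumption: the abstract does not state how the 41% figure was aggregated.

```python
from statistics import mean


def relative_reduction(n_baseline: int, n_candidate: int) -> float:
    """Fraction by which the candidate feature set is smaller than the baseline."""
    if n_baseline <= 0:
        raise ValueError("baseline must select at least one feature")
    return (n_baseline - n_candidate) / n_baseline


# Hypothetical (made-up) counts of selected features per dataset,
# given as (baseline method, candidate method). Not data from the paper.
hypothetical_counts = {
    "dataset_a": (40, 20),
    "dataset_b": (50, 35),
    "dataset_c": (10, 6),
}

# Reduction for each dataset, then the unweighted mean across datasets.
per_dataset = {
    name: relative_reduction(baseline, candidate)
    for name, (baseline, candidate) in hypothetical_counts.items()
}
average_reduction = mean(per_dataset.values())

print(f"average reduction: {average_reduction:.0%}")
```

An unweighted mean treats every dataset equally; weighting by baseline feature count would instead emphasize the datasets on which the baseline selects many features.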

Availability and implementation

The method is implemented as an R package under the GNU General Public License v3.0 and is available from the Comprehensive R Archive Network (CRAN) at https://cran.r-project.org/package=sivs.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Mahmoudian M

PROVIDER: S-EPMC8665768 | biostudies-literature | 2021 Dec

REPOSITORIES: biostudies-literature

Publications

Stable Iterative Variable Selection.

Mahmoudian M, Venäläinen MS, Klén R, Elo LL

Bioinformatics (Oxford, England), 2021 Dec 1, issue 24

PMID: 34270690

Similar Datasets

Fast Model-Fitting of Bayesian Variable Selection Regression Using the Iterative Complex Factorization Algorithm. | S-EPMC6788783 | biostudies-literature

Stable variable ranking and selection in regularized logistic regression for severely imbalanced big binary data. | S-EPMC9844919 | biostudies-literature

A stable iterative method for refining discriminative gene clusters. | S-EPMC2559882 | biostudies-literature

ADAPTIVE ROBUST VARIABLE SELECTION. | S-EPMC4286898 | biostudies-literature

Variable selection in ROC regression. | S-EPMC3838845 | biostudies-literature

Variable Selection via Partial Correlation. | S-EPMC5484095 | biostudies-literature

Iterative stable alignment and clustering of 2D transmission electron microscope images. | S-EPMC3426367 | biostudies-literature

Iterative random forests to discover predictive and stable high-order interactions. | S-EPMC5828575 | biostudies-literature

Liquid-based iterative recombineering method tolerant to counter-selection escapes. | S-EPMC4361647 | biostudies-literature

Iterative class discovery and feature selection using Minimal Spanning Trees. | S-EPMC520744 | biostudies-literature