Dataset Information

MISNN: Multiple Imputation via Semi-parametric Neural Networks.


ABSTRACT: Multiple imputation (MI) has been widely applied to missing value problems in biomedical, social and econometric research, in order to avoid improper inference in the downstream data analysis. In the presence of high-dimensional data, imputation models that include feature selection, especially ℓ1 regularized regression (such as Lasso, adaptive Lasso, and Elastic Net), are common choices to prevent the model from underdetermination. However, conducting MI with feature selection is difficult: existing methods are often computationally inefficient and poor in performance. We propose MISNN, a novel and efficient algorithm that incorporates feature selection for MI. Leveraging the approximation power of neural networks, MISNN is a general and flexible framework, compatible with any feature selection method, any neural network architecture, high/low-dimensional data and general missing patterns. Through empirical experiments, MISNN has demonstrated great advantages over state-of-the-art imputation methods (e.g. Bayesian Lasso and matrix completion), in terms of imputation accuracy, statistical consistency and computation speed.

SUBMITTER: Bu Z 

PROVIDER: S-EPMC10869892 | biostudies-literature | 2023 May

REPOSITORIES: biostudies-literature

Publications

MISNN: Multiple Imputation via Semi-parametric Neural Networks.

Bu Zhiqi, Dai Zongyu, Zhang Yiliang, Long Qi

Advances in knowledge discovery and data mining : ... Pacific-Asia Conference, PAKDD ..., proceedings. Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2023-05-27


Similar Datasets

| S-EPMC7612649 | biostudies-literature
| S-EPMC6581753 | biostudies-literature
| S-EPMC9136914 | biostudies-literature
| S-EPMC6994606 | biostudies-literature
| S-EPMC5963472 | biostudies-literature
| S-EPMC9035098 | biostudies-literature
| S-EPMC11373339 | biostudies-literature
| S-EPMC7596726 | biostudies-literature
| S-EPMC6563871 | biostudies-literature
| S-EPMC4860150 | biostudies-literature