{"database":"biostudies-literature","file_versions":[],"scores":null,"additional":{"submitter":["Jo J"],"funding":["National Research Foundation of Korea"],"pagination":["e0278570"],"full_dataset_link":["https://www.ebi.ac.uk/biostudies/studies/S-EPMC9714948"],"repository":["biostudies-literature"],"omics_type":["Unknown"],"volume":["17(12)"],"pubmed_abstract":["High-dimensional LASSO (Hi-LASSO) is a powerful feature selection tool for high-dimensional data. Our previous study showed that Hi-LASSO outperformed other state-of-the-art LASSO methods. However, the substantial cost of bootstrapping and the lack of experiments on a parametric statistical test for feature selection have impeded the use of Hi-LASSO in practical applications. In this paper, we present a Python package and its Spark library, which are designed to run efficiently in parallel on real-world problems and which provide parametric statistical tests for feature selection on high-dimensional data. We demonstrate Hi-LASSO's superior performance through various intensive experiments in practical settings. With these packages, Hi-LASSO can be run efficiently and easily for feature selection. The Hi-LASSO packages are publicly available at https://github.com/datax-lab/Hi-LASSO under the MIT license. 
The packages can be installed with pip, and additional documentation is available at https://pypi.org/project/hi-lasso and https://pypi.org/project/Hi-LASSO-spark."],"journal":["PloS one"],"pubmed_title":["Hi-LASSO: High-performance python and apache spark packages for feature selection with high-dimensional data."],"pmcid":["PMC9714948"],"funding_grant_id":["NRF-2021R1I1A3048029"],"pubmed_authors":["Park J","Kang M","Kim Y","Jo J","Jung S"],"additional_accession":[]},"is_claimable":false,"name":"Hi-LASSO: High-performance python and apache spark packages for feature selection with high-dimensional data.","description":"High-dimensional LASSO (Hi-LASSO) is a powerful feature selection tool for high-dimensional data. Our previous study showed that Hi-LASSO outperformed other state-of-the-art LASSO methods. However, the substantial cost of bootstrapping and the lack of experiments on a parametric statistical test for feature selection have impeded the use of Hi-LASSO in practical applications. In this paper, we present a Python package and its Spark library, which are designed to run efficiently in parallel on real-world problems and which provide parametric statistical tests for feature selection on high-dimensional data. We demonstrate Hi-LASSO's superior performance through various intensive experiments in practical settings. With these packages, Hi-LASSO can be run efficiently and easily for feature selection. The Hi-LASSO packages are publicly available at https://github.com/datax-lab/Hi-LASSO under the MIT license. 
The packages can be installed with pip, and additional documentation is available at https://pypi.org/project/hi-lasso and https://pypi.org/project/Hi-LASSO-spark.","dates":{"release":"2022-01-01T00:00:00Z","publication":"2022","modification":"2025-04-22T01:54:43.962Z","creation":"2025-04-05T20:07:28.297Z"},"accession":"S-EPMC9714948","cross_references":{"pubmed":["36455001"],"doi":["10.1371/journal.pone.0278570"]}}