Impact of the power of adaptive weight on penalized logistic regression: Application to cancer classification

Publisher: Tehran University of Medical Sciences
Journal: Journal of Biostatistics and Epidemiology (ISSN 2383-4196), Vol. 10, No. 3, 2024-12-15, pp. 253-272
Authors:
Narumol Sudjai, Department of Orthopaedic Surgery, Faculty of Medicine Siriraj Hospital, Mahidol University
Monthira Duangsaphon, Department of Mathematics and Statistics, Faculty of Science and Technology, Thammasat University
Chandhanarat Chandhanayingyong, Department of Orthopaedic Surgery, Faculty of Medicine Siriraj Hospital, Mahidol University
Dates: 2023-12-22, 2024-12-04

Background: The combination of high-dimensional sparse data and multicollinearity can make classification models unstable when they are applied to new datasets. The Lasso (Least Absolute Shrinkage and Selection Operator) is widely used in machine-learning algorithms. Although it is computationally feasible for high-dimensional data, it has certain drawbacks, and the adaptive Lasso was developed to address them. The power of the adaptive weight is one of the important parameters of this estimator, so we concentrate on this power in the penalty functions. This study aimed to compare the impact of the power of the adaptive weight on penalized logistic regression under high-dimensional sparse data with multicollinearity.

Methods: Penalized approaches were used for variable selection and parameter estimation. A Monte Carlo simulation was performed using 50 and 1000 independent variables and sample size equal to 30/40. The degree of correlation was set to 0.1, 0.3, 0.5, 0.75, 0.85, and 0.95. The performance of the power of the adaptive weight in the penalized approaches was evaluated in terms of the mean of the predicted mean squared error for the simulation study and the classification accuracy of the machine-learning model for the real-data applications.
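
To make the role of the power of the adaptive weight concrete, the following minimal sketch computes penalty weights from an initial coefficient estimate. It assumes the standard adaptive Lasso form w_j = 1 / |b_j|^gamma, where b_j is an initial estimate (for example from a ridge or Lasso fit) and gamma is the power; the abstract does not state this formula. The function names, the small constant eps, and the example coefficients are hypothetical, and reading "square root" as gamma = 0.5 and "higher-order" as gamma > 1 is an interpretation, not something the abstract confirms.

```python
def adaptive_weights(beta_init, gamma, eps=1e-8):
    """Penalty weights w_j = 1 / (|b_j| + eps) ** gamma.

    beta_init: initial coefficient estimates (assumed to come from a
               ridge or Lasso fit; values here are illustrative only).
    gamma:     power of the adaptive weight (must be positive).
    eps:       hypothetical guard against division by zero when an
               initial estimate is exactly 0.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return [1.0 / (abs(b) + eps) ** gamma for b in beta_init]


def weighted_l1_penalty(beta, weights, lam):
    """Adaptive Lasso penalty: lam * sum_j w_j * |beta_j|."""
    return lam * sum(w * abs(b) for w, b in zip(weights, beta))


if __name__ == "__main__":
    # Illustrative initial estimates: one large, one moderate, one near zero.
    beta_init = [2.0, 0.5, 0.01]
    for gamma in (0.5, 1.0, 2.0):
        weights = adaptive_weights(beta_init, gamma)
        penalty = weighted_l1_penalty(beta_init, weights, lam=0.1)
        print(f"gamma={gamma}: weights={weights}, penalty={penalty}")
```

Under this assumed form, a larger power penalizes coefficients with small initial estimates much more heavily while relaxing the penalty on large ones, which is why the choice of gamma interacts with the choice of initial estimator.
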
Results: The results showed that the higher-order adaptive Lasso performed best under very high-dimensional sparse data with multicollinearity when the initial weight was determined using a ridge estimator. However, for high-dimensional sparse data with multicollinearity, the square-root adaptive Lasso with the initial weight determined using the Lasso was the best option.

Conclusion: Our findings showed that the power of the adaptive weight in the penalty function and the initial weight can affect the classification accuracy of the machine-learning model. In practice, choosing these parameters appropriately produces models that perform well.

https://jbe.tums.ac.ir/index.php/jbe/article/view/1357