Sparse Variable Selection in Competing Risks Additive Hazards Regression: An application for identifying biomarkers related to prognosis of Bladder Cancer
Abstract
Introduction: Variable selection has become increasingly important in biomedical research, particularly in the analysis of high-throughput genomic data. Interest has grown in linking gene expression profiles to the timing of an event such as death, with the goal of evaluating the influence of biomedical variables on survival outcomes. A common special case of survival data is competing risks data, where identifying a small subset of gene expression profiles related to the cumulative incidence function (CIF) is crucial.
Methods: Several methods have been proposed for directly modeling the CIF by modeling the subdistribution hazard function of the cause or event of interest using the proportional hazards approach. We proposed a regularized method for variable selection in the additive subdistribution hazards model that combines the nonconcave penalized likelihood approach with the pseudoscore method. We conducted Monte Carlo simulations to evaluate the performance of the proposed method and used a publicly available dataset to illustrate the proposed model.
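As a rough illustration of how a penalty can be combined with a pseudoscore, the sketch below assumes the pseudoscore is linear in the coefficients, U(beta) = b - V beta, as in the additive hazards model of Lin and Ying. Penalized estimation then amounts to minimizing the quadratic loss 0.5 * beta'V beta - b'beta plus a penalty. The names `V`, `b`, `lam` and both functions are hypothetical. The plain LASSO penalty is used here only for simplicity, and the weighting needed for the subdistribution hazard is assumed to be already absorbed into `V` and `b`. This is not the authors' implementation.

```python
import numpy as np


def soft_threshold(z, lam):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    return np.sign(z) * max(abs(z) - lam, 0.0)


def lasso_quadratic_cd(V, b, lam, n_iter=200, tol=1e-8):
    """Coordinate descent for 0.5 * beta'V beta - b'beta + lam * sum|beta_j|.

    V (p x p, symmetric with positive diagonal) and b (length p) are assumed
    to come from a linear pseudoscore U(beta) = b - V beta; they are inputs
    here, not computed from survival data.
    """
    V = np.asarray(V, dtype=float)
    b = np.asarray(b, dtype=float)
    beta = np.zeros(len(b))
    for _ in range(n_iter):
        beta_old = beta.copy()
        for j in range(len(b)):
            # b_j minus the contribution of all coordinates except j
            r_j = b[j] - V[j] @ beta + V[j, j] * beta[j]
            beta[j] = soft_threshold(r_j, lam) / V[j, j]
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta
```

With `lam = 0` the routine solves the unpenalized equation V beta = b. Larger values of `lam` shrink small coefficients exactly to zero, which is what performs the variable selection.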
Results: Simulation results were presented together with an application to genomic data in which the endpoint is progression-free survival and the objective is to identify genes related to the CIF of bladder cancer in the presence of competing events. Five genes (CDC20, PLEK, FCN2, IGF1R and DCTD) were identified in common by the proposed penalized additive subdistribution hazards model under the different penalties.
Conclusions: The Monte Carlo simulation studies suggested that all penalties were comparable in terms of sensitivity and specificity, whereas the Adaptive Elastic Net (AENET) and Adaptive Least Absolute Shrinkage and Selection Operator (ALASSO) penalties tended to perform better in terms of estimation accuracy.
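To make the comparison criteria concrete, the following minimal sketch shows one way sensitivity and specificity of variable selection could be computed for a single simulated dataset, given the true and estimated coefficient vectors. The function name, the zero tolerance `tol`, and the use of summed squared error as the accuracy measure are assumptions for illustration. The abstract does not state which accuracy measure was used.

```python
def selection_metrics(true_beta, est_beta, tol=1e-8):
    """Sensitivity, specificity and squared error of a selected model.

    A coefficient counts as selected when its absolute value exceeds tol.
    Squared error is an assumed stand-in for estimation accuracy.
    """
    true_nz = [abs(t) > tol for t in true_beta]
    est_nz = [abs(e) > tol for e in est_beta]
    n_pos = sum(true_nz)                # truly relevant variables
    n_neg = len(true_nz) - n_pos        # truly irrelevant variables
    tp = sum(t and e for t, e in zip(true_nz, est_nz))
    tn = sum((not t) and (not e) for t, e in zip(true_nz, est_nz))
    sensitivity = tp / n_pos if n_pos else float("nan")
    specificity = tn / n_neg if n_neg else float("nan")
    sq_error = sum((t - e) ** 2 for t, e in zip(true_beta, est_beta))
    return sensitivity, specificity, sq_error


# Two truly relevant variables, one recovered; one false positive.
sens, spec, err = selection_metrics([1.0, 0.5, 0.0, 0.0, 0.0],
                                    [0.8, 0.0, 0.0, 0.1, 0.0])
```

In a simulation study these quantities would typically be averaged over replicates for each penalty before comparison.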
Issue: Vol 11 No 1 (2025)
Section: Articles
Keywords: Competing Risks; Subdistributions; Microarray; Additive hazards model; Variable selection; LASSO
Rights and permissions: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.