Original Article

Statistical Considerations in Combining Multiple Biomarkers for Diagnostic Classification

Logistic Regression Risk Score versus Discriminant Function Score


Introduction: In clinical practice, multiple biomarkers are frequently measured on the same subjects to diagnose an adverse outcome. This study compares two alternative approaches to combining several markers linearly: the logistic regression risk score and the discriminant function score.
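
For orientation, the two scores can be written in their common textbook forms. This is a sketch under assumed notation, not necessarily the parameterization used in the article: $x$ is the vector of marker values, $D$ the disease indicator, $\bar{x}_1$ and $\bar{x}_0$ the marker means in diseased and non-diseased subjects, and $S_p$ the pooled within-group covariance matrix.

```latex
% Logistic regression risk score (coefficients estimated by maximum likelihood)
\hat{p}(x) = \frac{1}{1 + \exp\{-(\hat{\beta}_0 + \hat{\beta}^{\top} x)\}}

% Fisher linear discriminant function score (coefficients from group means and pooled covariance)
L(x) = (\bar{x}_1 - \bar{x}_0)^{\top} S_p^{-1} x
```

Because the AUC is unchanged by any strictly increasing transformation of a score, ranking subjects by $\hat{p}(x)$ is the same as ranking them by the linear predictor $\hat{\beta}_0 + \hat{\beta}^{\top} x$. Both scores are therefore linear combinations of the markers and differ only in how the coefficients are estimated.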

Methods: Ten thousand simulated data sets were generated from binormal and non-binormal pairs of distributions with different sample sizes and correlation structures. Logistic regression and discriminant analysis were both applied to each data set. ROC analysis was performed on each marker alone and on the combined scores. For the two approaches, the mean AUC and its root mean square error (RMSE) were estimated over the 10,000 replications for every configuration and sample size. The practical utility of the two methods is further illustrated with a clinical example using real data.
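
The simulation design described above can be sketched roughly as follows for a single binormal configuration. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the two-marker setup, the mean shift, the correlation, the sample size, and the number of replications are all hypothetical choices. The RMSE is taken against the theoretical AUC under an equal-covariance binormal model, which may differ from the reference value used in the article. The AUC here is the in-sample (resubstitution) estimate.

```python
import math
import numpy as np


def empirical_auc(scores, labels):
    """Mann-Whitney AUC: P(case score > control score) + 0.5 * P(tie)."""
    cases, controls = scores[labels == 1], scores[labels == 0]
    diff = cases[:, None] - controls[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size


def lda_score(X, y):
    """Fisher discriminant score based on the pooled within-group covariance."""
    X0, X1 = X[y == 0], X[y == 1]
    n0, n1 = len(X0), len(X1)
    pooled = ((n0 - 1) * np.cov(X0, rowvar=False)
              + (n1 - 1) * np.cov(X1, rowvar=False)) / (n0 + n1 - 2)
    w = np.linalg.solve(pooled, X1.mean(axis=0) - X0.mean(axis=0))
    return X @ w


def logistic_score(X, y, n_iter=25):
    """Linear predictor of a logistic model fitted by Newton-Raphson."""
    A = np.column_stack([np.ones(len(X)), X])  # add intercept column
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        eta = np.clip(A @ beta, -30, 30)       # guard against overflow
        p = 1.0 / (1.0 + np.exp(-eta))
        hessian = A.T @ (A * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, A.T @ (y - p))
    return A @ beta


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def run_simulation(n_reps=200, n_per_group=100, mean_shift=(1.0, 1.0),
                   rho=0.2, seed=0):
    """Mean AUC and RMSE for each marker alone and for both combined scores."""
    rng = np.random.default_rng(seed)
    shift = np.asarray(mean_shift, dtype=float)
    cov = np.array([[1.0, rho], [rho, 1.0]])   # unit variances, correlation rho
    y = np.repeat([0, 1], n_per_group)
    # Mahalanobis distance between the groups gives the best achievable
    # AUC of a linear combination under this equal-covariance model.
    delta = math.sqrt(shift @ np.linalg.solve(cov, shift))
    true_auc = {
        "marker1": normal_cdf(shift[0] / math.sqrt(2.0)),
        "marker2": normal_cdf(shift[1] / math.sqrt(2.0)),
        "logistic": normal_cdf(delta / math.sqrt(2.0)),
        "lda": normal_cdf(delta / math.sqrt(2.0)),
    }
    aucs = {name: [] for name in true_auc}
    for _ in range(n_reps):
        X = np.vstack([
            rng.multivariate_normal(np.zeros(2), cov, size=n_per_group),
            rng.multivariate_normal(shift, cov, size=n_per_group),
        ])
        scores = {"marker1": X[:, 0], "marker2": X[:, 1],
                  "logistic": logistic_score(X, y), "lda": lda_score(X, y)}
        for name, s in scores.items():
            aucs[name].append(empirical_auc(s, y))
    summary = {}
    for name, values in aucs.items():
        values = np.array(values)
        rmse = float(np.sqrt(np.mean((values - true_auc[name]) ** 2)))
        summary[name] = {"mean_auc": float(values.mean()), "rmse": rmse,
                         "true_auc": true_auc[name]}
    return summary


if __name__ == "__main__":
    for name, stats in run_simulation().items():
        print(name, stats)
```

A non-binormal configuration could be explored by replacing the two `multivariate_normal` draws with another pair of distributions. In that case the closed-form `true_auc` no longer applies and would need to be replaced, for example by an estimate from a very large simulated sample.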

Results: The two approaches yielded identical accuracy, particularly with binormal data. With non-binormal data, the logistic regression risk score produced equal or slightly better accuracy than the discriminant function score.

Conclusion: Overall, the two approaches yield nearly identical results. However, with non-binormal data, the logistic regression model may yield a slightly better accuracy index than discriminant analysis.

Issue: Vol 8 No 2 (2022)
Section: Original Article(s)
Keywords: Logistic regression model, discriminant function score, ROC analysis, area under the curve (AUC), combining multiple biomarkers

Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
How to Cite
Hajian-Tilaki K, Geraili Z, Nassiri V. Statistical Considerations in Combining Multiple Biomarkers for Diagnostic Classification. JBE. 2022;8(2):138-151.