A Multi-Method Comparison of Machine Learning in predicting pharmacokinetic parameters: A simulation study
Abstract
Background: An important aim of population pharmacokinetic (PK) and pharmacodynamic (PD) modeling is to identify and quantify the relationships between model parameters and covariates, which improves the predictive performance of the population PK/PD model. Several new mathematical methods have been developed in pharmacokinetics in recent years, and they indicate that machine learning-based methods are an appealing tool for analyzing PK/PD data.
Methods: This simulation-based study aims to determine whether machine learning methods that have been used to predict blood serum concentration or clearance, including support vector regression (SVR) and random forest (RF), could effectively replace the Lasso covariate selection method in nonlinear mixed-effect models. Accordingly, the predictive performance of Lasso penalized regression, SVR, and RF regression was compared in detecting the associations between clearance and model covariates. PK data were simulated from a one-compartment model with oral administration. Covariates were created by sampling from a multivariate standard normal distribution with different levels of correlation. The true covariates influenced only clearance, with effects of different magnitudes. Lasso, RF, and SVR were compared in terms of mean absolute prediction error (MAE).
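The simulation set-up described above can be written compactly as follows. The notation is an assumption introduced here for illustration and is not taken from the source: $D$ is the dose, $F$ the bioavailability, $k_a$ the absorption rate constant, $V$ the volume of distribution, $CL_i$ the clearance of individual $i$, $x_{ij}$ the covariates, and $\eta_i$ a random effect. The log-linear covariate model on clearance is one common choice; the abstract does not state which form was used.

```latex
C_i(t) = \frac{F\,D\,k_a}{V\,\bigl(k_a - CL_i/V\bigr)}
         \left( e^{-(CL_i/V)\,t} - e^{-k_a t} \right),
\qquad
\log CL_i = \log \theta_{CL} + \sum_{j} \beta_j\, x_{ij} + \eta_i,
\qquad
\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \bigl| CL_i - \widehat{CL}_i \bigr|
```

Under this sketch, a covariate with $\beta_j = 0$ has no true effect on clearance, and a lower MAE indicates better prediction.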
Results: SVR performed best in small data sets, even when the covariates were highly correlated. This makes SVR a promising method for covariate selection in nonlinear mixed-effect models.
Conclusion: The Lasso method yielded a higher MAE, making it less promising than RF and SVR, especially when covariates are highly correlated and the number of individuals is small.
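To make the study design concrete, the following is a minimal sketch of such a simulation in Python using only the standard library. It is not the authors' implementation. Several parts are assumptions made for illustration: the equicorrelated covariate structure, the log-linear covariate model, all numeric values, the helper names, and the simplification of fitting directly to simulated log-clearance instead of estimating clearance from a nonlinear mixed-effect model. Only the Lasso is implemented (by coordinate descent); RF and SVR would normally come from an existing library and are left out.

```python
import math
import random


def simulate_covariates(n, p, rho, rng):
    """Draw n rows of p standard-normal covariates with pairwise correlation rho."""
    rows = []
    for _ in range(n):
        shared = rng.gauss(0.0, 1.0)  # common factor that induces the correlation
        rows.append([math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
                     for _ in range(p)])
    return rows


def simulate_log_clearance(X, betas, typical_cl, omega, rng):
    """Assumed model: log CL_i = log(typical_cl) + sum_j beta_j * x_ij + eta_i."""
    return [math.log(typical_cl) + sum(b * x for b, x in zip(betas, row)) + rng.gauss(0.0, omega)
            for row in X]


def concentration(t, dose, cl, v, ka):
    """One-compartment model with first-order absorption (bioavailability fixed at 1)."""
    ke = cl / v
    return dose * ka / (v * (ka - ke)) * (math.exp(-ke * t) - math.exp(-ka * t))


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become zero."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def lasso_fit(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n) * ||y - b0 - X b||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    b0 = sum(y) / n
    beta = [0.0] * p
    resid = [yi - b0 for yi in y]
    cols = [[row[j] for row in X] for j in range(p)]
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            # Correlation of column j with the residual that excludes its own effect.
            z = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(z, lam) / (sum(c * c for c in col) / n)
            delta = new - beta[j]
            if delta:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
        shift = sum(resid) / n  # re-centre the intercept
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, beta


def predict(b0, beta, X):
    return [b0 + sum(b * x for b, x in zip(beta, row)) for row in X]


def mae(y_true, y_pred):
    """Mean absolute prediction error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def run_demo(seed=1, n_train=50, n_test=200, p=10, rho=0.5):
    rng = random.Random(seed)
    betas = [0.4, 0.2] + [0.0] * (p - 2)  # only two covariates truly affect clearance
    X_train = simulate_covariates(n_train, p, rho, rng)
    y_train = simulate_log_clearance(X_train, betas, 10.0, 0.2, rng)
    X_test = simulate_covariates(n_test, p, rho, rng)
    y_test = simulate_log_clearance(X_test, betas, 10.0, 0.2, rng)
    b0, beta = lasso_fit(X_train, y_train, lam=0.05)
    lasso_mae = mae(y_test, predict(b0, beta, X_test))
    baseline_mae = mae(y_test, [sum(y_train) / n_train] * n_test)  # intercept-only model
    return lasso_mae, baseline_mae, beta


if __name__ == "__main__":
    lasso_mae, baseline_mae, beta = run_demo()
    print("Lasso MAE (log CL):", round(lasso_mae, 3))
    print("Intercept-only MAE (log CL):", round(baseline_mae, 3))
    print("Estimated coefficients:", [round(b, 2) for b in beta])
```

Changing `rho` and `n_train` in `run_demo` mimics the correlation levels and sample sizes varied in the study; the MAE here is computed on the log-clearance scale, which is a choice of this sketch.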
1. Steyerberg EW, Eijkemans MJC, Habbema JDF. Stepwise Selection in Small Data Sets: A Simulation Study of Bias in Logistic Regression Analysis. Journal of Clinical Epidemiology. 1999;52(10):935-42.
2. Ribbing J, Nyberg J, Caster O, Jonsson EN. The lasso—a novel method for predictive covariate model building in nonlinear mixed effects models. Journal of Pharmacokinetics and Pharmacodynamics. 2007;34(4):485-517.
3. Seok KH, Shim J, Cho D, Noh G-J, Hwang C. Semiparametric mixed-effect least squares support vector machine for analyzing pharmacokinetic and pharmacodynamic data. Neurocomputing. 2011;74(17):3412-9.
4. Durisová M, Dedík L. New mathematical methods in pharmacokinetic modeling. Basic & clinical pharmacology & toxicology. 2005;96(5):335-42.
5. Haem E, Harling K, Ayatollahi SMT, Zare N, Karlsson MO. Adjusted adaptive Lasso for covariate model-building in nonlinear mixed-effect pharmacokinetic models. Journal of Pharmacokinetics and Pharmacodynamics. 2017;44(1):55-66.
6. Chow H-H, Tolle KM, Roe DJ, Elsberry V, Chen H. Application of Neural Networks to Population Pharmacokinetic Data Analysis. Journal of Pharmaceutical Sciences. 1997;86(7):840-5.
7. Poynton M, Choi B, Kim Y, Park I, Noh G, Hong S, et al. Machine learning methods applied to pharmacokinetic modelling of remifentanil in healthy volunteers: a multi method comparison. Journal of International Medical Research. 2009;37(6):1680-91.
8. Longjun D, Xibing L, Ming X, Qiyue L. Comparisons of Random Forest and Support Vector Machine for Predicting Blasting Vibration Characteristic Parameters. Procedia Engineering. 2011;26:1772-81. 108 Vol 10 No 1 (2024) A Multi-Method Comparison of Machine Learning in predicting ... Doostfatemeh M et al.
9. Tolle KM, Chen H, Chow H-H. Estimating drug/plasma concentration levels by applying neural networks to pharmacokinetic data sets. Decision Support Systems. 2000;30(2):139-51.
10. Vapnik V. Statistical learning theory. Wiley, New York. 1998;1:624.
11. Dreiseitl S, Ohno-Machado L, Kittler H, Vinterbo S, Billhardt H, Binder M. A Comparison of Machine Learning Methods for the Diagnosis of Pigmented Skin Lesions. Journal of Biomedical Informatics. 2001;34(1):28-36.
12. Güler NF, Koçer S. Use of Support Vector Machines and Neural Network in Diagnosis of Neuromuscular Disorders. Journal of Medical Systems. 2005;29(3):271-84.
13. Wei L, Yang Y, Nishikawa RM, Jiang Y. A study on several machine-learning methods for classification of malignant and benign clustered microcalcifications. IEEE transactions on medical imaging. 2005;24(3):371-80.
14. Tibshirani R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological). 1996;58(1):267-88.
15. Breiman L. Random forests. Machine learning. 2001;45(1):5-32.
16. Smola AJ, Schölkopf B. A tutorial on support vector regression. Statistics and computing. 2004;14(3):199-222.
17. Weston J, Mukherjee S, Chapelle O, Pontil M, Poggio T, Vapnik V, editors. Feature selection for SVMs. Advances in neural information processing systems; 2001.
18. Drucker H, Burges CJ, Kaufman L, Smola AJ, Vapnik V, editors. Support vector regression machines. Advances in neural information processing systems; 1997.
19. Yap CW, Li ZR, Chen YZ. Quantitative structure–pharmacokinetic relationships for drug clearance by using statistical learning methods. Journal of Molecular Graphics and Modelling. 2006;24(5):383-95.
20. Dureja H, Gupta S, Madan AK. Topological Models for Prediction of Pharmacokinetic Parameters of Cephalosporins using Random Forest, Decision Tree and Moving Average Analysis. Scientia Pharmaceutica. 2008;76(3):377-94.
21. Cherkasov A, Muratov EN, Fourches D, Varnek A, Baskin II, Cronin M, et al. QSAR Modeling: Where Have You Been? Where Are You Going To? Journal of Medicinal Chemistry. 2014;57(12):4977-5010.
22. Svetnik V, Liaw A, Tong C, Culberson JC, Sheridan RP, Feuston BP. Random Forest: A Classification and Regression Tool for Compound Classification and QSAR Modeling. Journal of Chemical Information and Computer Sciences. 2003;43(6):1947-58.
23. Lombardo F, Obach RS, DiCapua FM, Bakken GA, Lu J, Potter DM, et al. A Hybrid Mixture Discriminant Analysis-Random Forest Computational Model for the Prediction of Volume of Distribution of Drugs in Human. Journal of Medicinal Chemistry. 2006;49(7):2262-7.
24. Ziegler A, König IR. Mining data with random forests: current options for real world applications. WIREs Data Mining and Knowledge Discovery. 2014;4(1):55-63.
25. R Core Team (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
26. Fidler M. nlmixr: an R package for population PKPD modeling. 2019.
27. Meyer D, Dimitriadou E, Hornik K, Weingessel A, Leisch F, Chang C-C, et al. Package ‘e1071’. The R Journal. 2019.
28. Liaw A, Wiener M. The randomforest package. R news. 2002;2(3):18-22.
29. Krämer N, Schäfer J, Boulesteix A-L. Regularized estimation of large-scale gene association networks using graphical Gaussian models. BMC Bioinformatics. 2009;10(1):384.
30. Savin I. A comparative study of the lasso-type and heuristic model selection methods. Jahrbücher für Nationalökonomie und Statistik. 2013;233(4):526-49.
31. Lu F, Petkova E. A comparative study of variable selection methods in the context of developing psychiatric screening instruments. Statistics in medicine. 2014;33(3):401-21.
32. Zhu X-W, Xin Y-J, Ge H-L. Recursive random forests enable better predictive performance and model interpretation than variable selection by LASSO. Journal of chemical information and modeling. 2015;55(4):736-46.
33. Xie Z-X, Hu Q-H, Yu D-R, editors. Improved feature selection algorithm based on SVM and correlation. International symposium on neural networks; 2006: Springer.
34. Song X, Halgamuge SK, Chen D, Hu S, Jiang B. The optimized support vector machine with correlative features for classification of natural spearmint essence. International Journal of Innovative Computing, Information and Control. 2010;6(3):1089-99.
35. Hassan SS, Farhan M, Mangayil R, Huttunen H, Aho T. Bioprocess data mining using regularized regression and random forests. BMC systems biology. 2013;7(1):1-7.
36. Bonate PL. The effect of collinearity on parameter estimates in nonlinear mixed effect models. Pharmaceutical research. 1999;16(5):709-17.
37. Hoggart CJ, Whittaker JC, De Iorio M, Balding DJ. Simultaneous analysis of all SNPs in genome-wide and re-sequencing association studies. PLoS Genet. 2008;4(7):e1000130.
38. Schelldorfer J, Meier L, Bühlmann P. GLMMLasso: an algorithm for high-dimensional generalized linear mixed models using ℓ1-penalization. Journal of Computational and Graphical Statistics. 2014;23(2):460-77.
39. Kang SH, Poynton MR, Kim KM, Lee H, Kim DH, Lee SH, et al. Population pharmacokinetic and pharmacodynamic models of remifentanil in healthy volunteers using artificial neural network analysis. British Journal of Clinical Pharmacology. 2007;64(1):3-13.
40. Combes F, Retout S, Frey N, Mentré F. Powers of the likelihood ratio test and the correlation test using empirical Bayes estimates for various shrinkages in population pharmacokinetics. CPT: pharmacometrics & systems pharmacology. 2014;3(4):1-9.
41. Tessier A, Bertrand J, Chenel M, Comets E. Combined Analysis of Phase I and Phase II Data to Enhance the Power of Pharmacogenetic Tests. CPT: pharmacometrics & systems pharmacology. 2016;5(3):123-31.
Issue: Vol 10 No 1 (2024)
Section: Articles
DOI: https://doi.org/10.18502/jbe.v10i1.17156
Keywords: Population pharmacokinetics; Machine Learning; Lasso; Random forest; Support vector regression
Rights and permissions: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.