Journal of Biostatistics and Epidemiology
https://jbe.tums.ac.ir/index.php/jbe
Tehran University of Medical Sciences | en-US | Journal of Biostatistics and Epidemiology | ISSN 2383-4196

DeepWei-Cu: A Deep Weibull Network for Cure Fraction Models
https://jbe.tums.ac.ir/index.php/jbe/article/view/1425
Introduction: Survival analysis including cure fraction subgroups is widely used in fields such as economics, engineering, and medicine. The core of the analysis is to understand the relationship between the covariates and the survival function while accounting for censoring and long-term survival. The analysis can be performed using traditional statistical models or neural networks. Recently, neural networks have attracted attention for analyzing lifetime data because they can efficiently estimate the survival function in the presence of complex covariates. To the best of our knowledge, this is the first time a parametric neural network has been introduced to analyze mixture cure fraction models.

Methods: In this paper, we introduce a novel neural network based on a mixture cure fraction Weibull loss function.

Results: An Alzheimer's disease dataset as well as a synthetic dataset are used to study the efficiency of the model. In both datasets, we compared the results with Weibull regression using goodness-of-fit methods.

Conclusion: The proposed neural network has the flexibility to analyze continuous data without discretization. It also has the advantage of using the properties of the Weibull distribution; for example, it can analyze data with different hazard rates (monotonically decreasing, monotonically increasing, and constant). Compared with Weibull regression, the proposed neural network performed better.

Ola Abuelamayem
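To make the mixture cure construction concrete, the following is a minimal sketch of a Weibull mixture cure negative log-likelihood of the kind the abstract describes, written as a PyTorch loss. It is not the authors' DeepWei-Cu implementation; the function and argument names (cure probability pi, shape k, scale lam) are illustrative assumptions.

```python
import torch

def mixture_cure_weibull_nll(pi, k, lam, t, event):
    """Negative log-likelihood of a Weibull mixture cure model (illustrative sketch).

    pi   : cure probability in (0, 1), e.g. a sigmoid network output
    k    : Weibull shape > 0, lam: Weibull scale > 0 (e.g. softplus outputs)
    t    : observed times, event: 1 if the event occurred, 0 if censored
    """
    z = (t / lam) ** k
    log_f = torch.log(k) - torch.log(lam) + (k - 1) * torch.log(t / lam) - z  # Weibull log-density
    log_S = -z                                                                # Weibull log-survival
    log_uncured = torch.log1p(-pi)
    # Event: log[(1 - pi) f_u(t)]; censored: log[pi + (1 - pi) S_u(t)]
    ll = event * (log_uncured + log_f) + (1 - event) * torch.log(pi + torch.exp(log_uncured + log_S))
    return -ll.mean()
```

In a network of the kind described, pi, k, and lam would be produced per subject from the covariates and this loss minimized jointly over event and censored observations.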
2024-12-01 | Vol. 10, No. 1, pp. 56-63 | DOI: 10.18502/jbe.v10i1.17153

Inference on stress-strength reliability based on progressively type-II censored data from two-parameter exponential distribution
https://jbe.tums.ac.ir/index.php/jbe/article/view/1431
Introduction: Stress-strength models have attracted considerable attention in recent years due to their applicability in various areas such as engineering, quality control, biology, genetics, and medicine. This paper investigates estimation of the stress-strength reliability parameter in two-parameter exponential distributions under progressively type-II censored samples.

Methods: The maximum likelihood and best linear unbiased estimates of the stress-strength reliability are obtained, and its Bayes estimates are computed under the squared error, linear-exponential (LINEX), and Stein loss functions. Confidence intervals for the stress-strength reliability are also obtained, including bootstrap confidence intervals, the highest posterior density (HPD) credible interval, and a confidence interval based on the generalized pivotal quantity.

Results: Using a simulation study, the point estimators and confidence intervals are evaluated and compared. A set of real data is presented for better clarification of the issue.

Conclusion: The results demonstrate that, as the sample size increases, the ERs of all the estimators decrease in almost all cases. Also, in almost all cases the Bayes estimator under the LINEX loss function has a smaller ER than the other estimators. Based on our simulation, the ELs of all intervals tend to decrease as the sample size increases. Moreover, the HPD confidence intervals are shorter than the other intervals across all parameter values considered.

Sajad Rostamian
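As a rough illustration of the quantity being estimated, the sketch below computes the maximum likelihood estimate of R = P(X < Y) and a percentile bootstrap interval for complete one-parameter exponential samples. The paper's actual setting (two-parameter exponentials under progressive type-II censoring, with Bayes estimators and HPD intervals) is more involved; all names and settings here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def reliability_mle(x, y):
    """MLE of R = P(X < Y) for exponential stress X and strength Y (complete samples)."""
    return np.mean(y) / (np.mean(x) + np.mean(y))

def bootstrap_ci(x, y, level=0.95, n_boot=2000):
    """Percentile bootstrap confidence interval for R."""
    stats = [reliability_mle(rng.choice(x, x.size, replace=True),
                             rng.choice(y, y.size, replace=True)) for _ in range(n_boot)]
    lo, hi = np.quantile(stats, [(1 - level) / 2, (1 + level) / 2])
    return lo, hi

x = rng.exponential(scale=2.0, size=40)   # stress
y = rng.exponential(scale=4.0, size=40)   # strength; true R = 2/(2+4) complement -> about 0.667
print(reliability_mle(x, y), bootstrap_ci(x, y))
```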
2024-12-01 | Vol. 10, No. 1, pp. 64-81 | DOI: 10.18502/jbe.v10i1.17154

Mining hypertension predictors using decision tree: Baseline data of Kharameh cohort study
https://jbe.tums.ac.ir/index.php/jbe/article/view/1432
Background: Hypertension is a serious chronic disease and an important risk factor for many health problems. This study aimed to investigate the factors associated with hypertension using a decision-tree algorithm.

Methods: This cross-sectional study was conducted in Kharameh city between 2014 and 2017 through a census. The study included 2510 hypertensive and 7840 non-hypertensive individuals. Seventy percent of the cases were randomly allocated to the training dataset for building the decision tree, while the remaining 30% were used as the testing dataset for evaluating its performance. Two models were assessed. In the first model (model I), 15 variables entered the model: age, gender, body mass index, years of education, occupation status, marital status, family history of hypertension, physical activity, total energy, number of meals, salt, oil type, drug use, alcohol use, and smoking. In the second model (model II), 16 variables were considered: age, gender, BMI, and the blood factors HCT, MCHC, PLT, FBS, BUN, CERAT, TG, CHOL, ALP, HDL, GGT, LDL, and SG. A receiver operating characteristic (ROC) curve was used to validate the models.

Results: Accuracy, sensitivity, specificity, and the area under the ROC curve (AUC) are the metrics used to evaluate the performance of a decision tree model. For model I, the accuracy, sensitivity, specificity, and AUC were 79.24%, 82.41%, 78.24%, and 0.80, respectively; for model II, the corresponding values were 79.50%, 81.03%, 79.02%, and 0.80.

Conclusion: We propose a decision tree model to identify the risk factors associated with hypertension. This model can be useful for early screening and for improving preventive and curative health services in health promotion.

Abbas Rezaianzadeh, Samane Nematolahi, Maryam Jalali, Shayan Rezaeianzadeh, Masoumeh Ghoddusi Johari, Seyed Vahid Hosseini
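A minimal sketch of the kind of workflow described (70/30 split, decision tree classifier, accuracy/sensitivity/specificity/AUC) using scikit-learn on synthetic stand-in data; the cohort variables, tree settings, and results of the study itself are not reproduced here and the numbers below are placeholders.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score, confusion_matrix

# Synthetic stand-in: rows = participants, columns = candidate predictors.
rng = np.random.default_rng(1)
X = rng.normal(size=(10350, 15))
y = rng.binomial(1, p=1 / (1 + np.exp(-(0.8 * X[:, 0] + 0.6 * X[:, 1] - 1.2))))

# 70/30 train/test split, as in the abstract.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)

tree = DecisionTreeClassifier(max_depth=5, min_samples_leaf=50, random_state=0).fit(X_tr, y_tr)
proba = tree.predict_proba(X_te)[:, 1]
pred = (proba >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
print("accuracy   ", (tp + tn) / len(y_te))
print("sensitivity", tp / (tp + fn))
print("specificity", tn / (tn + fp))
print("AUC        ", roc_auc_score(y_te, proba))
```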
2024-12-01 | Vol. 10, No. 1, pp. 82-97 | DOI: 10.18502/jbe.v10i1.17155

A Multi-Method Comparison of Machine Learning in predicting pharmacokinetic parameters: A simulation study
https://jbe.tums.ac.ir/index.php/jbe/article/view/1448
Background: One important aim of population pharmacokinetics (PK) and pharmacodynamics (PD) is to identify and quantify the relationships between model parameters and covariates in order to improve the predictive performance of population PK/PD modeling. Several new mathematical methods developed in pharmacokinetics in recent years indicate that machine learning-based methods are an appealing tool for analyzing PK/PD data.

Methods: This simulation-based study aims to determine whether machine learning methods, including support vector regression (SVR) and random forest (RF), which are specifically designed for the prediction of blood serum concentration or clearance, could be an effective replacement for the Lasso covariate selection method in nonlinear mixed effect models. Accordingly, the predictive performance of penalized (Lasso) regression, SVR, and RF regression was compared for detecting associations between clearance and model covariates. PK data were simulated from a one-compartment model with oral administration. Covariates were created by sampling from a multivariate standard normal distribution with different levels of correlation. The true covariates influenced only clearance, at different magnitudes. Lasso, RF, and SVR were compared in terms of mean absolute prediction error (MAE).

Results: The results show that SVR performed best in small datasets, even those with high correlation between covariates. This makes SVR a promising method for covariate selection in nonlinear mixed-effect models.

Conclusion: The Lasso method yielded a higher MAE, making it less promising than RF and SVR, especially when dealing with high correlation between covariates and a low number of individuals.

Marziyeh Doostfatemeh, Kamal Amini, Elham Haem
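The comparison described can be sketched as follows: correlated covariates are simulated, individual clearances are generated from a few true covariates, and Lasso, RF, and SVR are compared by cross-validated MAE. The simulation settings and hyperparameters below are illustrative assumptions, not those of the study, and the full study works within a nonlinear mixed-effect model rather than directly on clearances.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import mean_absolute_error

# Correlated covariates; clearance driven by the first two of them plus noise.
rng = np.random.default_rng(2)
n, p, rho = 50, 10, 0.6
cov = rho * np.ones((p, p)) + (1 - rho) * np.eye(p)
X = rng.multivariate_normal(np.zeros(p), cov, size=n)
cl = np.exp(0.5 + 0.3 * X[:, 0] + 0.2 * X[:, 1] + rng.normal(scale=0.1, size=n))

models = {
    "Lasso": LassoCV(cv=5),
    "RF": RandomForestRegressor(n_estimators=500, random_state=0),
    "SVR": SVR(kernel="rbf", C=10.0, epsilon=0.01),
}
for name, model in models.items():
    pred = cross_val_predict(model, X, cl, cv=5)
    print(name, "MAE:", round(mean_absolute_error(cl, pred), 3))
```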
2024-12-01 | Vol. 10, No. 1, pp. 98-110 | DOI: 10.18502/jbe.v10i1.17156

Bounded multivariate contaminated normal mixture model with applications to skin cancer detection
https://jbe.tums.ac.ir/index.php/jbe/article/view/1453
<p><strong>Background & Aim:</strong> In real-world datasets, outliers are a common occurrence that can have a significant impact on the accuracy and reliability of statistical analyses. Detecting these outliers and developing robust models to handle their presence is a crucial challenge in data analysis. For instance, natural images may have complex distributions of values due to environmental factors like noise and illumination, resulting in objects with overlapping regions and non-trivial contours that cannot be accurately described by Gaussian mixture models. In many real life applications, observed data always fall in bounded support regions. This leads to the idea of bounded support mixture models. Motivated by the aforementioned observations, we introduce a bounded multivariate cntaminated normal distribution for fitting data with non-Gaussian distributions, asymmetry, and bounded support which makes finite mixture models more robust to fitting, since rare observations are given less importance in calculations.</p> <p><strong>Methods & Materials:</strong> A family of finite mixtures of bounded multivariate contaminated normal distributions is introduced. The model is well-suited for computer vision and pattern recognition problems due to its heavily-tailed and bounded nature, providing flexibility in modeling data in the presence of outliers. A feasible expectation-maximization algorithm is developed to compute the maximum likelihood estimates of the model parameters using a selection mechanism.</p> <p><strong>Results:</strong> The proposed methodology is validated by conducting experiments on two real natural skin cancer images. We estimate the parameters by the proposed expectation-maximization algorithm. The obtained results shown that the proposed model showed that the proposed method has successfully enhanced accuracy in segmenting skin lesions.</p> <p><strong>C</strong><strong>onclusion:</strong> The reliable model-based clustering using finite mixtures of bounded multivariate contaminated normal distributions is introduced. An expectation-maximization algorithm was created to estimate parameters, with closed-form expressions utilized at the E-step. Practical tests on images for skin cancer detection showed enhanced accuracy in delineating skin lesions.</p>Abbas Mahdavi
2024-12-01 | Vol. 10, No. 1, pp. 111-123 | DOI: 10.18502/jbe.v10i1.17157