Original Article

Comparison of Nearest Neighbor and Caliper Algorithms in Outcome Propensity Score Matching to Study the Relationship between Type 2 Diabetes and Coronary Artery Disease


Introduction: Propensity score matching (PSM) is a method for reducing the impact of confounders. When the number of confounders is high, matching on them directly becomes a problem, because finding matched pairs for the case group is difficult or impossible. The propensity score (PS) reduces the confounders to a single dimension, so subjects can be matched on one score while the effect of the confounders is minimized. There are various algorithms for PSM. This study aimed to compare the nearest neighbor and caliper algorithms.

Methods: The data came from patients who underwent angiography at Ghaem Hospital in Mashhad in 2011-12. The study was a retrospective case-control study using PSM. In total, 604 patients were included in the case and control groups. A logistic regression model was used to calculate the propensity score, adjusting for age, gender, body mass index (BMI), systolic blood pressure, smoking status, and triglyceride. Then, the odds ratios (ORs) with 95% confidence intervals (CIs) were determined for the raw data and for the data matched by each of the two algorithms, to examine the relationship between type 2 diabetes and coronary artery disease (CAD).
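The difference between the two matching algorithms can be illustrated with a minimal Python sketch. It assumes the propensity scores have already been estimated (for example, by logistic regression) and uses greedy 1:1 matching without replacement; the function name `greedy_match`, the toy scores, and the caliper width of 0.1 are illustrative assumptions, not details taken from the study.

```python
import numpy as np

def greedy_match(ps_case, ps_control, caliper=None):
    """Greedy 1:1 nearest neighbor matching on the propensity score.

    Controls are used at most once (matching without replacement).
    If `caliper` is given, a pair is kept only when the absolute score
    distance is at most `caliper`; otherwise the case stays unmatched.
    Returns a list of (case_index, control_index) pairs.
    """
    ps_case = np.asarray(ps_case, dtype=float)
    ps_control = np.asarray(ps_control, dtype=float)
    available = np.ones(len(ps_control), dtype=bool)
    pairs = []
    for i, score in enumerate(ps_case):
        if not available.any():
            break  # no controls left to match
        dist = np.abs(ps_control - score)
        dist[~available] = np.inf  # exclude controls already used
        j = int(np.argmin(dist))
        if caliper is not None and dist[j] > caliper:
            continue  # nearest control is too far away: drop this case
        pairs.append((i, j))
        available[j] = False
    return pairs

# Hypothetical propensity scores, for illustration only.
case_scores = [0.30, 0.55, 0.90]
control_scores = [0.28, 0.50, 0.62, 0.10]

nn_pairs = greedy_match(case_scores, control_scores)                 # plain nearest neighbor
cal_pairs = greedy_match(case_scores, control_scores, caliper=0.1)   # nearest neighbor within a caliper
```

In this toy example the plain nearest neighbor run pairs every case, including one whose closest control is 0.28 away, while the caliper run leaves that case unmatched. This is one way a caliper can yield fewer but closer pairs.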

Results: Of the 604 patients, the nearest neighbor and caliper algorithms matched 200 and 178 pairs, respectively. All variables differed significantly between the two groups before matching (P<0.05). After matching with the nearest neighbor algorithm, gender still differed significantly between the two groups (P=0.002). After matching with the caliper algorithm, no variable differed significantly between the two groups.

Conclusion: The caliper algorithm achieved greater bias reduction than the nearest neighbor algorithm for all variables except triglyceride.
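The abstract does not state how bias reduction was computed. One common convention, assumed here, is to measure the standardized difference in means of each covariate before and after matching and report the percentage by which it shrinks. The function names and the numbers below are hypothetical.

```python
import numpy as np

def standardized_bias(x_case, x_control):
    """Standardized difference in means, in percent.

    Uses the square root of the average of the two sample variances
    as the denominator (an assumed, commonly used convention).
    """
    x_case = np.asarray(x_case, dtype=float)
    x_control = np.asarray(x_control, dtype=float)
    pooled_sd = np.sqrt((x_case.var(ddof=1) + x_control.var(ddof=1)) / 2.0)
    return 100.0 * (x_case.mean() - x_control.mean()) / pooled_sd

def percent_bias_reduction(bias_before, bias_after):
    """Percentage by which the absolute standardized bias shrinks."""
    return 100.0 * (1.0 - abs(bias_after) / abs(bias_before))

# Hypothetical covariate values, for illustration only.
bias = standardized_bias([2, 4, 6], [1, 3, 5])
reduction = percent_bias_reduction(40.0, 10.0)
```

Under this convention, a covariate whose standardized bias falls from 40% before matching to 10% after matching has a bias reduction of 75%.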

Issue: Vol 7 No 3 (2021)
Section: Original Article(s)
DOI: https://doi.org/10.18502/jbe.v7i3.7297
Keywords: propensity score matching; caliper algorithm; nearest neighbor algorithm; diabetes; coronary artery disease

Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
How to Cite
Sabbaghian Tousi S, Tabesh H, Saki A, Tagipour A, Tajfard M. Comparison of Nearest Neighbor and Caliper Algorithms in Outcome Propensity Score Matching to Study the Relationship between Type 2 Diabetes and Coronary Artery Disease. jbe. 2021;7(3):251-262.