Original Article

Assessing Diagnostic Accuracy of Doctors Without a Gold Standard using Bayesian Networks and K-modes Clustering Algorithm

Abstract

Background & Aim: The diagnostic accuracy of a test is its ability to discriminate accurately between patients who have and do not have the target disease. A common problem in assessing the diagnostic accuracy of doctors is that the true disease status is unknown, which the literature refers to as the “absence of a gold standard”.
Methods & Material: In this article, a Naïve Bayesian network with a hidden class node and K-modes, a clustering-based algorithm for categorical data, are proposed for estimating the diagnostic accuracy of 5 physicians in diagnosing Diabetic Retinopathy. To assess and compare the efficiencies of these models, a simulation study with two different scenarios is also conducted.
Results: The simulation study indicates that, for the Naïve Bayesian network and a non-rare disease (prevalence 0.1 and 0.2), the coverage probability increases with the sample size. For high prevalence values (e.g., 0.5), however, coverage probabilities are not as good as those for the non-rare disease. The efficiency of the K-modes algorithm decreases as the number of records increases, but it achieves better results when the number of records is small, the prevalence is approximately 0.3, and the sensitivities are high. Results for the real data set reveal that the sensitivities of all physicians except one were higher than 85% and all specificities were higher than 90%. The estimated prevalence was 0.32.
Conclusion: Through simulations and data analysis we show that this new approach based on Naïve Bayesian networks provides a useful alternative to the traditional latent class modeling approaches used in this setting.
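To make the idea of a Naïve Bayesian network with a hidden class node more concrete, the following is a minimal sketch and not the authors' implementation. It assumes binary ratings (1 = disease present), conditional independence of the raters given the unobserved true status, and estimation by a plain EM algorithm; the abstract does not state which estimation procedure was used. The function names `simulate` and `fit_latent_class`, the starting values, and all numbers in the usage example are hypothetical and chosen only for illustration.

```python
import random


def simulate(n, prevalence, sens, spec, seed=0):
    """Draw n patients; each rater's call depends only on the hidden true status."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        diseased = rng.random() < prevalence
        row = []
        for se, sp in zip(sens, spec):
            p_pos = se if diseased else 1.0 - sp
            row.append(1 if rng.random() < p_pos else 0)
        data.append(row)
    return data


def _clamp(p, eps=1e-6):
    """Keep probabilities away from 0 and 1 so the products below stay positive."""
    return min(max(p, eps), 1.0 - eps)


def fit_latent_class(data, n_iter=100):
    """EM for a two-class naive Bayes model whose class node is never observed."""
    n, k = len(data), len(data[0])
    # Assumed starting values: sens + spec > 1 fixes which class means "diseased".
    prev, sens, spec = 0.5, [0.8] * k, [0.8] * k
    for _ in range(n_iter):
        # E-step: posterior probability of disease for every patient.
        post = []
        for row in data:
            p1, p0 = prev, 1.0 - prev
            for j, x in enumerate(row):
                p1 *= sens[j] if x else 1.0 - sens[j]
                p0 *= (1.0 - spec[j]) if x else spec[j]
            post.append(p1 / (p1 + p0))
        # M-step: weighted proportions, with the posteriors as weights.
        w1 = sum(post)
        w0 = n - w1
        prev = _clamp(w1 / n)
        sens = [_clamp(sum(p * r[j] for p, r in zip(post, data)) / w1)
                for j in range(k)]
        spec = [_clamp(sum((1.0 - p) * (1 - r[j]) for p, r in zip(post, data)) / w0)
                for j in range(k)]
    return prev, sens, spec


if __name__ == "__main__":
    # Hypothetical settings for five raters; these are not the study's data.
    true_sens = [0.90, 0.85, 0.90, 0.88, 0.80]
    true_spec = [0.95, 0.95, 0.95, 0.95, 0.95]
    ratings = simulate(2000, 0.3, true_sens, true_spec, seed=1)
    prev_hat, sens_hat, spec_hat = fit_latent_class(ratings)
    print("estimated prevalence:", round(prev_hat, 3))
    print("estimated sensitivities:", [round(s, 3) for s in sens_hat])
    print("estimated specificities:", [round(s, 3) for s in spec_hat])
```

Under these assumptions the model coincides with the classical two-class latent class model, so the sketch mainly illustrates how prevalence, sensitivities, and specificities can be estimated jointly when no rater is treated as the gold standard. Interval estimates, such as those behind the coverage probabilities reported above, would need an additional step (for example, bootstrapping) that is not shown here.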

Issue: Vol 4 No 4 (2018)
Section: Original Article(s)
Keywords
Bayesian networks; Cluster Analysis; Diabetic Retinopathy; Humans; Sensitivity

Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
How to Cite
1. Niloofar P, Niloofar P, Yaseri M. Assessing Diagnostic Accuracy of Doctors Without a Gold Standard using Bayesian Networks and K-modes Clustering Algorithm. JBE. 2019;4(4):184-195.