Classifying Substance Abuse Tendencies Using the Naive Bayes Algorithm
Abstract
Background
Uncertainty in human life often arises from incomplete knowledge of past events or of circumstances that have not yet occurred. The Naive Bayes classification technique, rooted in conditional probability, links two random events through a hypothesis and computes posterior probabilities. Substance addiction remains a critical problem, particularly among patients hospitalized in community mental health centers, and calls for effective predictive methods that support early identification and intervention.
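For readers unfamiliar with the technique, the posterior computation referred to above can be written out as follows. This is the standard textbook form of the Naive Bayes rule, not a formula quoted from the study; the symbols $C$ (class, e.g. addiction tendency present or absent) and $x_1, \dots, x_n$ (patient attributes) are notation assumed here for illustration.

```latex
% Bayes' theorem for a class C given attributes x_1, ..., x_n
P(C \mid x_1, \dots, x_n) = \frac{P(C)\, P(x_1, \dots, x_n \mid C)}{P(x_1, \dots, x_n)}

% "Naive" assumption: attributes are conditionally independent given the class
P(x_1, \dots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C)

% Resulting decision rule (the denominator does not depend on C)
\hat{C} = \arg\max_{C} \; P(C) \prod_{i=1}^{n} P(x_i \mid C)
```

The independence assumption is what makes the model tractable with few observations: each $P(x_i \mid C)$ is estimated from simple per-attribute counts.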
Methods
This study employed the Naive Bayes algorithm to classify substance addiction tendencies in patients. To enhance prediction accuracy, feature selection was conducted using the Information Value (IV) method. Ten patient attributes were analyzed, including gender, education level, income status, and relationship status with family and environment, among others. Features with strong or medium predictive power were prioritized for the model.
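The Information Value screening described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the additive smoothing constant, and the toy data are hypothetical, and the strength cut-offs (0.02, 0.1, 0.3) are the commonly cited rule-of-thumb bands from the credit-scoring literature, which the study may or may not have used exactly. The sign convention for weight of evidence varies between sources; IV itself is unaffected by that choice.

```python
import math
from collections import Counter


def information_value(values, labels, smoothing=0.5):
    """Return (IV, WoE per category) for one categorical feature.

    labels: 1 = event (e.g. substance addiction), 0 = non-event.
    smoothing: hypothetical additive constant that avoids log(0)
    when a category has no events or no non-events.
    """
    events = Counter(v for v, y in zip(values, labels) if y == 1)
    non_events = Counter(v for v, y in zip(values, labels) if y == 0)
    categories = sorted(set(values))
    k = len(categories)
    total_events = sum(events.values()) + smoothing * k
    total_non_events = sum(non_events.values()) + smoothing * k

    iv = 0.0
    woe = {}
    for c in categories:
        p_event = (events[c] + smoothing) / total_events
        p_non_event = (non_events[c] + smoothing) / total_non_events
        woe[c] = math.log(p_event / p_non_event)   # weight of evidence
        iv += (p_event - p_non_event) * woe[c]     # IV sums over categories
    return iv, woe


def iv_strength(iv):
    """Map an IV score to a rule-of-thumb label (assumed cut-offs)."""
    if iv < 0.02:
        return "not predictive"
    if iv < 0.1:
        return "weak"
    if iv < 0.3:
        return "medium"
    return "strong"


# Hypothetical toy data, NOT taken from the study.
toy_labels = [1] * 6 + [0] * 2 + [1] * 2 + [0] * 6
toy_gender = ["male"] * 8 + ["female"] * 8
toy_noise = ["a", "b"] * 8  # feature unrelated to the label

gender_iv, gender_woe = information_value(toy_gender, toy_labels)
noise_iv, _ = information_value(toy_noise, toy_labels)
print("gender:", round(gender_iv, 3), iv_strength(gender_iv))
print("noise: ", round(noise_iv, 3), iv_strength(noise_iv))
```

In a workflow like the one described, only features labeled medium or strong would be passed on to the classifier.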
Results
Four features (gender, education level, income status, and relationship status with family and environment) demonstrated strong or medium predictive power for substance abuse. The Naive Bayes model indicated that males are approximately four times more likely than females to develop substance addiction. Patients whose education ranged from primary to high school were more prone to addiction than those with college-level education or higher. Patients under state protection showed a higher likelihood of substance abuse than patients in other income categories. Finally, individuals with poor or neutral relationships with family and their environment were more susceptible to addiction.
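To make concrete how a Naive Bayes model turns categorical attributes such as gender and education level into class probabilities, here is a minimal sketch. It is an assumption-laden illustration rather than the study's implementation: the class name `CategoricalNaiveBayes`, the Laplace smoothing parameter `alpha`, and the eight toy records are all hypothetical and do not reproduce the reported results.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes for categorical features with Laplace smoothing (sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # hypothetical smoothing strength

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.n = len(labels)
        self.feature_values = defaultdict(set)   # feature -> observed values
        self.counts = defaultdict(Counter)       # (feature, class) -> value counts
        for row, y in zip(rows, labels):
            for feature, value in row.items():
                self.feature_values[feature].add(value)
                self.counts[(feature, y)][value] += 1
        return self

    def predict_proba(self, row):
        # Work in log space: log P(C) + sum_i log P(x_i | C)
        log_scores = {}
        for c in self.classes:
            score = math.log(self.class_counts[c] / self.n)
            for feature, value in row.items():
                num = self.counts[(feature, c)][value] + self.alpha
                den = (self.class_counts[c]
                       + self.alpha * len(self.feature_values[feature]))
                score += math.log(num / den)
            log_scores[c] = score
        # Normalise so the class probabilities sum to 1
        top = max(log_scores.values())
        unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
        z = sum(unnorm.values())
        return {c: u / z for c, u in unnorm.items()}


# Hypothetical toy records, NOT patient data from the study.
toy_rows = [
    {"gender": "male", "education": "primary"},
    {"gender": "male", "education": "high_school"},
    {"gender": "male", "education": "primary"},
    {"gender": "female", "education": "primary"},
    {"gender": "female", "education": "college"},
    {"gender": "female", "education": "college"},
    {"gender": "male", "education": "college"},
    {"gender": "female", "education": "high_school"},
]
toy_targets = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = addiction tendency (assumed coding)

model = CategoricalNaiveBayes(alpha=1.0).fit(toy_rows, toy_targets)
p_high = model.predict_proba({"gender": "male", "education": "primary"})
p_low = model.predict_proba({"gender": "female", "education": "college"})
print(p_high, p_low)
```

Comparing the class-conditional estimates for two categories of the same feature, for example male versus female, is one way a relative likelihood such as the one reported for gender could be read off a fitted model.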
Conclusion
The Naive Bayes algorithm effectively classified substance addiction tendencies in hospitalized patients, emphasizing key predictive factors such as gender, education level, income status, and relational dynamics. These findings highlight the importance of targeted interventions tailored to at-risk populations, improving early detection and management strategies in community mental health settings.
| Issue | Vol 11 No 2 (2025) |
| Section | Articles |
| DOI | https://doi.org/10.18502/jbe.v11i2.20558 |
| Keywords | machine learning; Naive Bayes algorithm; information value; classification; substance abuse |
| Rights and permissions | This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. |

