Comparing the Performance of Naive Bayes And Decision Tree Classification Using R

Pages: 11-19


Author(s)

Kirtika Yadav 1,*, Reema Thareja 2

1. Delhi, India

2. Department of Computer Science, SPM College, University of Delhi

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2019.12.02

Received: 25 Mar. 2019 / Revised: 4 Jun. 2019 / Accepted: 12 Aug. 2019 / Published: 8 Dec. 2019

Index Terms

Decision tree classification, Data mining, R language, Supervised learning, Naive Bayes classification

Abstract

Technology use is at its peak, and many companies try to reduce effort while still obtaining efficient results within a fixed amount of time. A large amount of data is processed each day, stored, and accumulated into large datasets. To obtain useful information, these datasets must be analyzed so that knowledge can be extracted by training a machine. In this paper, we use two popular classification techniques, Decision Tree and Naive Bayes, and compare their classification performance on our dataset. We use a student performance dataset with 480 observations. We classify the students into different groups and then calculate the accuracy of each classification using the R language. A decision tree uses a divide-and-conquer method and produces rules that are easy for humans to understand. The Naive Bayes classifier is based on Bayes' theorem and assumes that every pair of features is independent given the class.
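
As a brief sketch of the independence assumption described in the abstract, the standard Naive Bayes formulation can be written as follows. The symbols $C$ (class label), $x_1, \dots, x_n$ (feature values), and $n$ (number of features) are notation introduced here for illustration and are not taken from the paper.

```latex
% Bayes' theorem for a class C and features x_1, ..., x_n
\[
P(C \mid x_1, \dots, x_n)
  = \frac{P(C)\, P(x_1, \dots, x_n \mid C)}{P(x_1, \dots, x_n)}
\]

% Naive assumption: features are independent given the class
\[
P(x_1, \dots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C)
\]

% Resulting decision rule (the denominator does not depend on C)
\[
\hat{C} = \arg\max_{C} \; P(C) \prod_{i=1}^{n} P(x_i \mid C)
\]
```

Under this assumption, each factor $P(x_i \mid C)$ can be estimated separately from the training data, which is what keeps the method simple even when there are many features.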

Cite This Paper

Kirtika Yadav, Reema Thareja, "Comparing the Performance of Naive Bayes And Decision Tree Classification Using R", International Journal of Intelligent Systems and Applications (IJISA), Vol. 11, No. 12, pp. 11-19, 2019. DOI: 10.5815/ijisa.2019.12.02
