An E-mail Spam Detection using Stacking and Voting Classification Methodologies

Full Text (PDF, 342KB), PP.27-36


Author(s)

Aasha Singh 1,*, Awadhesh Kumar 1, Ajay Kumar Bharti 2, Vaishali Singh 3

1. KNIT, Sultanpur, India

2. BBDU, Lucknow, India

3. MUIT, Lucknow, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijieeb.2022.06.03

Received: 30 Jun. 2022 / Revised: 5 Aug. 2022 / Accepted: 20 Sep. 2022 / Published: 8 Dec. 2022

Index Terms

Email, Spam, SVM, Linear Regression, Stacking, Voting.

Abstract

Email is now used in almost every field, and hardly a minute passes without people around the world sending or reading messages. Emails fall into two types: ham and spam. Ham is useful email, while spam is junk or unwanted email. Spam may carry unwanted or harmful content, including viruses, that can compromise user privacy. It also wastes recipients' time and energy and can be used to steal valuable information. Because the volume of spam is growing rapidly, spam detection and filtering are prominent problems that need to be solved. This paper discusses several machine learning models, such as Naïve Bayes, Support Vector Machine, Decision Tree, Extra Decision Tree, and Linear Regression, and surveys these techniques for email spam detection in terms of their accuracy and precision. It also presents a comprehensive comparison of these techniques, and of stacking different algorithms, based on their speed, accuracy, and precision.
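As a rough illustration of the two ensemble ideas named in the title, the sketch below contrasts voting (take the majority label of the base classifiers) with stacking (train a meta-learner on the base classifiers' outputs). Everything in it is an assumption made for illustration: the three rule functions stand in for trained models such as Naïve Bayes or SVM, the toy emails are invented, and the perceptron meta-learner is one simple choice among many. None of it is taken from the paper's experiments. Because the base rules here are fixed rather than trained, the meta-learner is fitted directly on their outputs; with trained base models, out-of-fold predictions would normally be used instead.

```python
from collections import Counter

# Hypothetical base classifiers: hand-written rules standing in for trained
# models. Each returns 1 (spam) or 0 (ham).
def keyword_rule(text):
    return int(any(w in text.lower() for w in ("free", "winner", "prize")))

def exclamation_rule(text):
    return int(text.count("!") >= 2)

def link_rule(text):
    return int("http://" in text.lower())

BASE_CLASSIFIERS = [keyword_rule, exclamation_rule, link_rule]

def base_predictions(text):
    """Collect one 0/1 prediction per base classifier."""
    return [clf(text) for clf in BASE_CLASSIFIERS]

def hard_vote(text):
    """Voting: return the label chosen by the majority of base classifiers."""
    return Counter(base_predictions(text)).most_common(1)[0][0]

def train_stacker(texts, labels, epochs=20):
    """Stacking: fit a perceptron meta-learner on the base predictions."""
    weights = [0.0] * len(BASE_CLASSIFIERS)
    bias = 0.0
    for _ in range(epochs):
        for text, label in zip(texts, labels):
            feats = base_predictions(text)
            score = bias + sum(w * f for w, f in zip(weights, feats))
            error = label - int(score > 0)  # -1, 0, or +1
            bias += error
            weights = [w + error * f for w, f in zip(weights, feats)]
    return weights, bias

def stacked_predict(model, text):
    """Apply the trained meta-learner to the base predictions for one email."""
    weights, bias = model
    score = bias + sum(w * f for w, f in zip(weights, base_predictions(text)))
    return int(score > 0)

# Invented toy training set (1 = spam, 0 = ham).
TRAIN_TEXTS = [
    "Free prize inside!! http://x.example",
    "Meeting moved to 3pm",
    "Lunch tomorrow?",
    "Update your account at http://phish.example",
    "Great news!! The team won the cup",
    "Feel free to call me",
]
TRAIN_LABELS = [1, 0, 0, 1, 0, 0]

model = train_stacker(TRAIN_TEXTS, TRAIN_LABELS)
probe = "Reset your password: http://bad.example"
print(hard_vote(probe), stacked_predict(model, probe))
```

On this toy data the two ensembles disagree about the probe email: only the link rule fires, so the majority vote labels it ham, while the meta-learner has learned to trust the link rule and labels it spam. The point is only that stacking can weight base models unequally; it says nothing about which approach performs better on real email corpora.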

Cite This Paper

Aasha Singh, Awadhesh Kumar, Ajay Kumar Bharti, Vaishali Singh, "An E-mail Spam Detection using Stacking and Voting Classification Methodologies", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol.14, No.6, pp. 27-36, 2022. DOI:10.5815/ijieeb.2022.06.03
