Performance Evaluation of Machine Learning-based Robocalls Detection Models in Telephony Networks

Pages: 37-53

Author(s)

Bodunde O. Akinyemi 1,*, Oluwatoyin H. Odukoya 1, Mistura L. Sanni 1, Gilbert Sewagnon 1, Ganiyu A. Aderounmu 1

1. Department of Computer Science and Engineering, Obafemi Awolowo University, Ile-Ife, Nigeria

* Corresponding author.

DOI: https://doi.org/10.5815/ijcnis.2022.06.04

Received: 12 Feb. 2022 / Revised: 8 Jun. 2022 / Accepted: 27 Aug. 2022 / Published: 8 Dec. 2022

Index Terms

Spam, VoIP, Robocalls, SPIT, Machine Learning

Abstract

Many techniques have been proposed to detect and prevent spam over Internet telephony. These techniques detect human-initiated spam calls fairly accurately, but they are less effective against robocalls, a type of voice spam whose calling patterns resemble those of legitimate users. This paper proposes a model for robocall detection using a machine learning approach. Voice data recordings were collected and the features relevant to the study were selected. The selected features were then used to formulate six detection models. The formulated models were simulated and evaluated on standard performance metrics to identify the best-performing model. The C4.5 decision tree algorithm gave the best result, with an accuracy of 99.15%, a sensitivity of 0.991, a false alarm rate of 0.009, and a precision of 0.992. It was therefore concluded that this approach can be used to detect and filter both machine-initiated and human-initiated spam calls.
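
As a minimal sketch of how the four reported metrics relate to one another, the snippet below computes accuracy, sensitivity, false alarm rate, and precision from a binary confusion matrix (spam versus legitimate). The counts, the `ConfusionCounts` class, and the `evaluate` function are hypothetical illustrations, not the paper's data or code, and the paper may have averaged its figures differently (for example, weighted across classes). Each metric is a proportion between 0 and 1, so a sensitivity of about 0.991 corresponds to roughly 99.1%.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical binary confusion matrix; 'positive' means a spam call."""
    tp: int  # spam calls correctly flagged
    fp: int  # legitimate calls wrongly flagged
    tn: int  # legitimate calls correctly passed
    fn: int  # spam calls missed


def evaluate(c: ConfusionCounts) -> dict[str, float]:
    """Return the four metrics as proportions in [0, 1]."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),       # true positive rate (recall)
        "false_alarm_rate": c.fp / (c.fp + c.tn),  # false positive rate
        "precision": c.tp / (c.tp + c.fp),
    }


# Illustrative counts only (assumed for this example, not taken from the paper).
counts = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
metrics = evaluate(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.3f} ({value:.1%})")
```

Keeping the metrics as proportions and converting to percentages only when printing avoids mixing the two scales in a single report.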

Cite This Paper

Bodunde O. Akinyemi, Oluwatoyin H. Odukoya, Mistura L. Sanni, Gilbert Sewagnon, Ganiyu A. Aderounmu, "Performance Evaluation of Machine Learning-Based Robocalls Detection Models in Telephony Networks", International Journal of Computer Network and Information Security (IJCNIS), Vol. 14, No. 6, pp. 37-53, 2022. DOI: 10.5815/ijcnis.2022.06.04
