Detecting Android Malware by Mining Enhanced System Call Graphs

pp. 28-41

Author(s)

Rajif Agung Yunmar 1,2,*, Sri Suning Kusumawardani 1, Widyawan Widyawan 1, Fadi Mohsen 3

1. Dept. Electrical and Information Engineering, Universitas Gadjah Mada, Indonesia

2. Dept. Informatics Engineering, Institut Teknologi Sumatera, Indonesia

3. Dept. Computer Science, University of Groningen, 9712 CP Groningen, Netherlands

* Corresponding author.

DOI: https://doi.org/10.5815/ijcnis.2024.02.03

Received: 16 Mar. 2023 / Revised: 31 May 2023 / Accepted: 25 Oct. 2023 / Published: 8 Apr. 2024

Index Terms

Heuristic-based Detection, Android, Malware, System Call, Graph, Machine Learning

Abstract

The persistent threat of malicious applications targeting Android devices has been growing in number and severity. Numerous techniques have been used to defend against this threat, including heuristic-based ones, which are able to detect unknown malware. System calls are among the many features that heuristic-based detection uses. Researchers have used several representation methods to capture system calls, such as histograms. However, some information may be lost if system calls are represented only as a 1-dimensional vector. Graphs can capture unusual or suspicious interactions between different system calls, which can indicate malicious behavior. This study uses machine learning algorithms to recognize malicious behavior represented in a graph. The system call graph was fed into machine learning algorithms such as AdaBoost, Decision Table, Naïve Bayes, Random Forest, IBk, J48, and Logistic Regression. We further employ a series of feature selection methods to improve detection accuracy and reduce computational complexity. Our experimental results show that the proposed method has reduced feature dimension to 91.95% and provides 95.32% detection accuracy.
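As a minimal illustration of why a 1-dimensional histogram can lose information that a graph keeps, the sketch below compares two hypothetical system call traces. The traces, the function names, and the plain transition-graph construction are assumptions made for illustration only; the abstract does not specify how the paper's enhanced system call graph is built.

```python
from collections import Counter


def histogram(trace):
    """1-dimensional representation: how often each system call occurs."""
    return Counter(trace)


def call_graph(trace):
    """Directed graph as weighted edges: (call, next_call) -> count.

    Assumption: an edge links each call to the call that follows it.
    """
    return Counter(zip(trace, trace[1:]))


# Two hypothetical traces that use the same calls the same number of times.
trace_a = ["open", "read", "close", "socket", "sendto"]
trace_b = ["open", "read", "socket", "sendto", "close"]

# The histograms cannot tell the traces apart, but the graphs can.
same_histogram = histogram(trace_a) == histogram(trace_b)
same_graph = call_graph(trace_a) == call_graph(trace_b)

print("same histogram:", same_histogram)
print("same graph:", same_graph)
```

In this sketch only the graph of the second trace contains the edge from `read` to `socket`, so the graph preserves the ordering between calls that the histogram discards.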

Cite This Paper

Rajif Agung Yunmar, Sri Suning Kusumawardani, Widyawan Widyawan, Fadi Mohsen, "Detecting Android Malware by Mining Enhanced System Call Graphs", International Journal of Computer Network and Information Security(IJCNIS), Vol.16, No.2, pp.28-41, 2024. DOI:10.5815/ijcnis.2024.02.03
