Sumit S. Lad

Work place: Dept. of CSE, Rajarambapu Institute of Technology, Rajaramnagar, Sangli, Maharashtra, India

E-mail: sumitx86@gmail.com

Research Interests: Machine Learning

Biography

Sumit S. Lad is pursuing a Master of Technology in Computer Science and Engineering at Rajarambapu Institute of Technology, Rajaramnagar, Sangli, MS, India. He completed his B.E. at Shivaji University, Kolhapur, MS, India. His areas of interest are cloud computing, machine learning, and classification problems.

Author Articles
Improved Deep Learning Model for Static PE Files Malware Detection and Classification

By Sumit S. Lad, Amol C. Adamuthe

DOI: https://doi.org/10.5815/ijcnis.2022.02.02, Pub. Date: 8 Apr. 2022

Static analysis and detection of malware is a crucial phase in handling security threats. Many researchers have reported that static analysis suffers from imbalanced datasets, which lead to invalid result metrics. Extracting features from raw binaries is also time-consuming, and methods such as neural networks take a long time to train. Considering these problems, we propose a model capable of building a feature set from the dataset and classifying static PE files efficiently. The research emphasizes the importance of feature extraction rather than model building. Well-extracted features help to provide better results when fed to neural networks with a minimal number of layers. Using fewer layers improves the model's performance and reduces the resources and time needed for processing and evaluation. This work uses the EMBER datasets published by Endgame Inc., which contain PE file information. Feature extraction, data standardization, and data cleaning techniques are applied to handle the imbalance and impurities in the dataset. The extracted features are then scaled into a standard form to avoid problems related to range variations. A total of 2381 features are extracted and pre-processed from each of the 2017 and 2018 datasets.
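The scaling step mentioned above can be illustrated with a minimal sketch. The abstract does not say which scaling method was used, so z-score standardization (zero mean, unit variance per feature) is an assumption here. The function name `standardize_columns` and the toy values are hypothetical and do not come from the EMBER data.

```python
from statistics import fmean, pstdev

def standardize_columns(rows):
    """Scale each feature column to zero mean and unit variance (z-score)."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # A constant column has zero spread; use 1.0 so it maps to 0.0
    # instead of triggering a division by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Toy rows with three features on very different scales (illustrative only).
samples = [
    [1.0, 2000.0, 0.5],
    [2.0, 4000.0, 0.5],
    [3.0, 6000.0, 0.5],
]
scaled = standardize_columns(samples)
```

After scaling, the first two columns take identical values even though their raw ranges differ by a factor of 2000, which is the range-variation problem the scaling is meant to avoid.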

The pre-processed data is then given to a deep learning model for training. The deep learning model is built from dense and dropout layers to minimize resource strain and deliver more accurate results in less time. The results obtained during experimentation on the EMBER v2017 and v2018 datasets are 97.53% and 94.09%, respectively. The model is trained for ten epochs with a learning rate of 0.01 and takes 4 minutes per epoch, which is one minute less than the Decision Tree model. In terms of precision, our model achieves 98.85%, which is 1.85% higher than the existing models.
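To show what dense and dropout layers do, here is a minimal pure-Python sketch of one fully connected layer followed by inverted dropout. It is not the authors' architecture: the layer sizes, weights, ReLU activation, and dropout rate are all assumptions made for illustration, and the names `dense_relu` and `dropout` are hypothetical.

```python
import random

def dense_relu(inputs, weights, biases):
    """Fully connected layer with ReLU; one weight row per output unit."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate` during
    training and rescale the survivors; pass values through otherwise."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
# Stand-in for one pre-processed feature vector (the paper uses 2381 features).
features = [0.5, -1.0, 2.0]
# Assumed toy parameters for a layer with four units.
weights = [[0.2, 0.1, 0.4], [-0.3, 0.5, 0.1], [0.6, -0.2, 0.3], [0.1, 0.1, 0.1]]
biases = [0.0, 0.1, -0.1, 0.0]

hidden = dense_relu(features, weights, biases)
train_out = dropout(hidden, rate=0.5, rng=rng, training=True)
eval_out = dropout(hidden, rate=0.5, rng=rng, training=False)
```

Dropout is active only during training. At evaluation time the layer returns its input unchanged, so it adds no inference cost.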

Malware Classification with Improved Convolutional Neural Network Model

By Sumit S. Lad, Amol C. Adamuthe

DOI: https://doi.org/10.5815/ijcnis.2020.06.03, Pub. Date: 8 Dec. 2020

Malware is a threat to people in the cyber world. It steals personal information and harms computer systems. Developers and information security specialists around the globe continuously work on strategies for detecting malware. Over the last few years, many researchers have investigated machine learning for malware classification. The existing solutions require more computing resources and are not efficient for datasets with large numbers of samples. Using existing feature extractors to extract image features also consumes more resources. This paper presents a Convolutional Neural Network (CNN) model with pre-processing and augmentation techniques for the classification of malware gray-scale images. An investigation is conducted on the Malimg dataset, which contains 9339 gray-scale images created from malware binaries belonging to 25 different families. To create a precise approach, and considering the success of deep learning techniques in classifying the rising volume of newly created malware, we propose a CNN model and a hybrid CNN+SVM model. The CNN is used as an automatic feature extractor that uses fewer resources and less time than the existing methods. The proposed CNN model achieves 98.03% accuracy, which is better than other existing CNN models, namely VGG16 (96.96%), ResNet50 (97.11%), InceptionV3 (97.22%), and Xception (97.56%). The execution time of the proposed CNN model is also significantly lower than that of the other existing CNN models. The proposed CNN model is then hybridized with a support vector machine (SVM). Instead of using Softmax as the activation function, the SVM classifies the malware based on features extracted by the CNN model. The proposed fine-tuned CNN model produces a well-selected feature vector of 256 neurons from the FC layer, which is the input to the SVM. A linear SVC kernel transforms the binary SVM classifier into a multi-class SVM, which classifies the malware samples using the one-against-one method and delivers an accuracy of 99.59%.
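The one-against-one scheme mentioned above can be sketched as follows: one binary classifier per pair of classes, with the final label chosen by majority vote. This is a minimal illustration, not the authors' implementation. The trained binary SVMs are replaced by a hypothetical stand-in rule (`nearest_prototype`), the class names are invented, and each sample is a single number instead of a 256-dimensional feature vector.

```python
from collections import Counter
from itertools import combinations

def one_vs_one_predict(sample, classes, pairwise_decide):
    """Return the class that wins the most pairwise contests."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_decide(sample, a, b)] += 1
    # Ties go to the class listed first in `classes`.
    return max(classes, key=lambda c: votes[c])

# Hypothetical stand-in for trained binary SVMs: each class has a scalar
# prototype, and the closer prototype wins the pairwise contest.
prototypes = {"family_a": 0.0, "family_b": 5.0, "family_c": 10.0}

def nearest_prototype(sample, a, b):
    return a if abs(sample - prototypes[a]) <= abs(sample - prototypes[b]) else b

classes = list(prototypes)
predicted = one_vs_one_predict(6.0, classes, nearest_prototype)

# Number of pairwise classifiers needed for 25 families: 25 * 24 / 2.
pair_count_25 = len(list(combinations(range(25), 2)))
```

For the 25 Malimg families, this scheme needs 300 pairwise classifiers, each trained only on samples from its two classes.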

Other Articles