Fangwei Wang
MalSort: Lightweight and efficient image-based malware classification using masked self-supervised framework with Swin Transformer
Wang, Fangwei; Shi, Xipeng; Yang, Fang; Song, Ruixin; Li, Qingru; Tan, Zhiyuan; Wang, Changguang
Authors
Xipeng Shi
Fang Yang
Ruixin Song
Qingru Li
Dr Thomas Tan Z.Tan@napier.ac.uk
Associate Professor
Changguang Wang
Abstract
Malware has surged in both quantity and diversity, posing significant threats to the Internet and to indispensable network applications. Accurate and effective classification plays a pivotal role in defending against malware. Numerous approaches employ supervised learning techniques, specifically Convolutional Neural Networks (CNNs), to train feature extractors. However, acquiring a substantial quantity of labeled samples is expensive, and relying solely on CNNs as feature extractors may restrict the model to local receptive fields, compromising the preservation of crucial features. To address these constraints, we propose an effective malware classification approach, denoted MalSort, which leverages a masked self-supervised framework with Swin Transformer. First, each malware instance is transformed into a color image. Next, the Swin Transformer self-supervised framework extracts multi-scale key feature vectors from a randomly masked partial color image, while the prediction module predicts the masked image. Finally, the pre-trained encoder is fine-tuned on the malware dataset to carry out the malware classification task. MalSort relies less on labeled samples during training, obviating the need for extensive amounts of labeled data. Consequently, MalSort conserves hardware resources and improves training efficiency. The experimental results indicate that MalSort outperforms existing models, achieving a classification accuracy of 97.85%, a recall of 97.63%, a precision of 97.85%, and an F1-score of 97.85% on the BIG2015 dataset. Similarly, on the Malimg dataset, the model achieves 98.28%, 98.18%, 98.19%, and 98.28% for classification accuracy, recall, precision, and F1-score, respectively.
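The first two steps described in the abstract (turning a malware binary into a color image, then randomly masking part of it) can be illustrated with a minimal sketch. The abstract does not state how bytes are mapped to pixels, what patch size is used, or what fraction of the image is masked, so the three-bytes-per-RGB-pixel packing, the zero padding, the `patch` size, the `mask_ratio`, and all function names below are assumptions made purely for illustration. The Swin Transformer encoder and the prediction module are not reproduced here.

```python
import math
import random

def bytes_to_rgb_image(data: bytes, width: int = 4):
    """Pack consecutive byte triples into RGB pixels, row by row.

    Assumed mapping (not taken from the paper): bytes 3k, 3k+1, 3k+2
    become the R, G, B channels of pixel k; the tail is zero-padded.
    """
    pixels_needed = math.ceil(len(data) / 3)
    height = max(1, math.ceil(pixels_needed / width))
    padded = data + bytes(height * width * 3 - len(data))
    pixels = [tuple(padded[i:i + 3]) for i in range(0, len(padded), 3)]
    return [pixels[r * width:(r + 1) * width] for r in range(height)]

def random_patch_mask(height, width, patch, mask_ratio, seed=None):
    """Return a grid of booleans (True = masked) over non-overlapping patches."""
    rows, cols = height // patch, width // patch
    n_patches = rows * cols
    n_masked = int(round(n_patches * mask_ratio))
    rng = random.Random(seed)
    masked = set(rng.sample(range(n_patches), n_masked))
    return [[(r * cols + c) in masked for c in range(cols)] for r in range(rows)]

def apply_mask(image, mask, patch, fill=(0, 0, 0)):
    """Overwrite every pixel of each masked patch with a fill color."""
    out = [list(row) for row in image]
    for r, mask_row in enumerate(mask):
        for c, is_masked in enumerate(mask_row):
            if is_masked:
                for y in range(r * patch, (r + 1) * patch):
                    for x in range(c * patch, (c + 1) * patch):
                        out[y][x] = fill
    return out

# Usage: a 48-byte toy "binary" becomes a 4x4 RGB image, half of whose
# 2x2 patches are hidden before it would be passed to an encoder.
image = bytes_to_rgb_image(bytes(range(48)), width=4)
mask = random_patch_mask(height=4, width=4, patch=2, mask_ratio=0.5, seed=0)
masked_image = apply_mask(image, mask, patch=2)
```

In a masked self-supervised setup of the kind the abstract describes, the model would be trained to reconstruct the hidden patches from the visible ones, which requires no class labels; labels are only needed for the later fine-tuning stage.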
Citation
Wang, F., Shi, X., Yang, F., Song, R., Li, Q., Tan, Z., & Wang, C. (2024). MalSort: Lightweight and efficient image-based malware classification using masked self-supervised framework with Swin Transformer. Journal of Information Security and Applications, 83, Article 103784. https://doi.org/10.1016/j.jisa.2024.103784
Journal Article Type | Article |
---|---|
Acceptance Date | May 1, 2024 |
Online Publication Date | May 14, 2024 |
Publication Date | June 2024 |
Deposit Date | May 15, 2024 |
Publicly Available Date | May 15, 2026 |
Electronic ISSN | 2214-2126 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 83 |
Article Number | 103784 |
DOI | https://doi.org/10.1016/j.jisa.2024.103784 |
Keywords | malware classification, deep learning, self-supervised learning, Swin Transformer, multi-scale key feature |
Public URL | http://researchrepository.napier.ac.uk/Output/3634273 |
Publisher URL | https://www.sciencedirect.com/journal/journal-of-information-security-and-applications |
Files
This file is under embargo until May 15, 2026 due to copyright reasons.
Contact repository@napier.ac.uk to request a copy for personal use.