Multi-modal Features Representation-based Convolutional Neural Network Model for Malicious Website Detection

Alsaedi, Mohammed; Ghaleb, Fuad A.; Saeed, Faisal; Ahmad, Jawad; Alasli, Mohammed

doi:10.1109/access.2023.3348071

Authors

Mohammed Alsaedi

Fuad A. Ghaleb

Faisal Saeed

Dr Jawad Ahmad J.Ahmad@napier.ac.uk
Visiting Lecturer

Mohammed Alasli

Abstract

Web applications have proliferated across various business sectors, serving as essential tools for billions of users in their daily activities. However, many of these applications are malicious, which poses a major threat to Internet users because they can steal sensitive information, install malware, and propagate spam. Detecting malicious websites by analyzing web content is ineffective due to the complexity of extracting representative features, the huge data volume, the evolving nature of malicious patterns, the stealthy nature of the attacks, and the limitations of traditional classifiers. Uniform Resource Locator (URL) features are static and can often provide immediate insights about a website without the need to load its content. Relying solely on lexical URL features, however, proves insufficient and can lead to inaccurate classifications. This study proposes a multimodal representation approach that fuses textual and image-based features to enhance the performance of malicious website detection. Textual features help the deep learning model understand and represent detailed semantic information related to attack patterns, while image features are effective in recognizing more general malicious patterns. As a result, patterns that are hidden in the textual format may be recognizable in the image format. Two Convolutional Neural Network (CNN) models were constructed to extract the hidden features from the textual and image-based representations. The output layers of both models were combined and used as input to an artificial neural network classifier for decision-making. The results show the effectiveness of the proposed model compared with other models.
The overall performance in terms of Matthews...
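
The two-view idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how URLs are turned into text or image inputs, so everything below is an assumption made for illustration: the helper names (`url_to_char_ids`, `url_to_image`, `fuse`), the character-code and byte-grid encodings, and the sizes are all hypothetical, and the CNN branches are replaced by simple stand-in vectors. Only the final step, concatenating the two branches' outputs into one vector for a classifier, follows the description above.

```python
from typing import List


def url_to_char_ids(url: str, max_len: int = 32) -> List[int]:
    """Textual view (assumed encoding): one integer per character, zero-padded."""
    ids = [min(ord(ch), 255) for ch in url[:max_len]]
    return ids + [0] * (max_len - len(ids))


def url_to_image(url: str, side: int = 8) -> List[List[float]]:
    """Image view (assumed encoding): UTF-8 bytes laid out row by row
    on a side x side grayscale grid with values in [0, 1]."""
    data = url.encode("utf-8")[: side * side]
    padded = list(data) + [0] * (side * side - len(data))
    return [[padded[r * side + c] / 255.0 for c in range(side)] for r in range(side)]


def fuse(text_features: List[float], image_features: List[float]) -> List[float]:
    """Fusion by concatenation: the joint vector a downstream classifier would consume."""
    return list(text_features) + list(image_features)


url = "http://example.com/login"  # hypothetical input
char_ids = url_to_char_ids(url)
image = url_to_image(url)

# Stand-ins for the outputs of the two CNN branches (no learning happens here):
# the scaled character ids and the flattened pixel grid.
text_vec = [i / 255.0 for i in char_ids]
image_vec = [pixel for row in image for pixel in row]

joint = fuse(text_vec, image_vec)
print(len(text_vec), len(image_vec), len(joint))  # prints: 32 64 96
```

In a real model, `text_vec` and `image_vec` would be the learned feature vectors produced by the two CNNs, and `joint` would be passed to the neural network classifier rather than printed.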

Citation

Alsaedi, M., Ghaleb, F. A., Saeed, F., Ahmad, J., & Alasli, M. (2024). Multi-modal Features Representation-based Convolutional Neural Network Model for Malicious Website Detection. IEEE Access, 12, 7271 - 7284. https://doi.org/10.1109/access.2023.3348071

Journal Article Type	Article
Acceptance Date	Dec 25, 2023
Online Publication Date	Dec 28, 2023
Publication Date	2024
Deposit Date	Jan 10, 2024
Publicly Available Date	Jan 10, 2024
Journal	IEEE Access
Electronic ISSN	2169-3536
Publisher	Institute of Electrical and Electronics Engineers (IEEE)
Peer Reviewed	Peer Reviewed
Volume	12
Pages	7271 - 7284
DOI	https://doi.org/10.1109/access.2023.3348071
Public URL	http://researchrepository.napier.ac.uk/Output/3440318