Application of evolutionary machine learning in metamorphic malware analysis and detection

Authors

Babaagba, Kehinde Oluwatoyin

Abstract

Malware detection and analysis have become key issues in recent years. A particularly dangerous class of malware is metamorphic malware, which can modify its own code and hide malicious instructions within normal program code. Current malware detectors are susceptible to metamorphic malware because they are pre-trained to recognize only predicted versions of code. If detectors could be trained on a larger data set that included potential mutant variants, they could be more accurate. Finding new evasive variants is challenging, however, because many variants might exist.

In this research, a two-phase system is proposed. First, a mutation-only Evolutionary Algorithm (EA) is used to search for a diverse set of new malicious mutants that evade detection by existing detection algorithms. While this is shown to be successful, it requires multiple runs of the algorithm to produce multiple variants, with no explicit guarantee of diversity. To address this, a Quality Diversity (QD) algorithm, MAP-Elites, is then developed to return a large and diverse repertoire of solutions in a single run. MAP-Elites traverses a high-dimensional search space in search of the best solution at every point of a low-dimensional feature space. This method produces a larger and more diverse archive of solutions than the mutation-only EA and offers insight into the properties of a sample that lead to it being undetectable by a suite of existing detection engines.
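
The MAP-Elites loop described above can be illustrated with a minimal sketch. It is a generic toy version, not the thesis implementation: the abstract does not state the fitness function, the feature descriptors, or the mutation operators, so `toy_fitness`, `toy_descriptor`, `toy_random`, `toy_mutate`, `BINS`, and `GENOME_LEN` are all hypothetical stand-ins operating on lists of floats rather than on malware samples. The sketch keeps one elite per cell of a low-dimensional feature grid and replaces it only when a mutated candidate that maps to the same cell scores higher.

```python
import random

BINS = 5          # cells per descriptor axis (hypothetical choice)
GENOME_LEN = 16   # length of the toy genome (hypothetical choice)


def toy_random(rng):
    """Draw a random genome of floats in [0, 1]."""
    return [rng.random() for _ in range(GENOME_LEN)]


def toy_mutate(genome, rng):
    """Mutation-only variation: perturb one gene and clip it to [0, 1]."""
    child = list(genome)
    i = rng.randrange(len(child))
    child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0.0, 0.2)))
    return child


def toy_fitness(genome):
    """Stand-in quality score (higher is better); not an evasion metric."""
    return -sum((g - 0.5) ** 2 for g in genome)


def toy_descriptor(genome):
    """Map a genome to a cell of a 2-D feature grid."""
    half = len(genome) // 2

    def to_bin(values):
        mean = sum(values) / len(values)
        return min(int(mean * BINS), BINS - 1)

    return (to_bin(genome[:half]), to_bin(genome[half:]))


def map_elites(fitness, descriptor, random_solution, mutate,
               n_init=50, n_iter=2000, seed=0):
    """Return an archive mapping each visited cell to (score, solution)."""
    rng = random.Random(seed)
    archive = {}

    def try_insert(solution):
        cell = descriptor(solution)
        score = fitness(solution)
        # Keep the candidate only if its cell is empty or it beats the elite.
        if cell not in archive or score > archive[cell][0]:
            archive[cell] = (score, solution)

    # Seed the archive with random solutions.
    for _ in range(n_init):
        try_insert(random_solution(rng))

    # Repeatedly pick an elite uniformly at random, mutate it, and try to
    # place the child in whichever cell its descriptor maps to.
    for _ in range(n_iter):
        parent = archive[rng.choice(sorted(archive))][1]
        try_insert(mutate(parent, rng))
    return archive


archive = map_elites(toy_fitness, toy_descriptor, toy_random, toy_mutate)
print(f"{len(archive)} of {BINS * BINS} cells filled")
```

A single run fills many cells of the grid at once, which is the property the abstract contrasts with the mutation-only EA, where separate runs are needed to obtain separate variants.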

Once a set of evasive and diverse variants has been created, detectors are trained using a set of classical classification methods (feature-based and sequence-based models). The results show that classification of metamorphic malware can be improved by augmenting the training data with the diverse set of evolved variant samples. A pretrained Natural Language Processing (NLP) model is also used in a transfer learning setting, again with the evolved variants as part of the training data, and likewise shows improved classification of metamorphic malware.

Citation

Babaagba, K. O. Application of evolutionary machine learning in metamorphic malware analysis and detection. (Thesis). Edinburgh Napier University. Retrieved from http://researchrepository.napier.ac.uk/Output/2801469

Thesis Type: Thesis
Deposit Date: Sep 13, 2021
Publicly Available Date: Sep 13, 2021
DOI: https://doi.org/10.17869/enu.2021.2801469
Public URL: http://researchrepository.napier.ac.uk/Output/2801469
Award Date: Jul 31, 2021

Files

Application of evolutionary machine learning in metamorphic malware analysis and detection (8.4 Mb)
PDF

Copyright Statement
Papers under publishers' copyright have been redacted from the appendices.



