Differential Area Analysis for Ransomware Attack Detection within Mixed File Datasets
Davies, Simon R; Macfarlane, Richard; Buchanan, William J
Authors
Simon Davies S.Davies@napier.ac.uk
Visiting Fellow
Rich Macfarlane R.Macfarlane@napier.ac.uk
Associate Professor
Prof Bill Buchanan B.Buchanan@napier.ac.uk
Professor
Abstract
The threat from ransomware continues to grow, both in the number of victims affected and in the cost incurred by the people and organisations impacted by a successful attack. In the majority of cases, once a victim has been attacked, only two courses of action remain open to them: pay the ransom or lose their data. One behaviour common to all crypto-ransomware strains is that at some point during their execution they will attempt to encrypt the users' files. This paper demonstrates a technique that can identify when these encrypted files are being generated, independently of the ransomware strain. An enhanced mixed-file ransomware data set of more than 130,000 files was developed, based on the govdocs corpus. This data set was enriched with files that reflect the more modern Microsoft file formats, as well as high-entropy file formats such as compressed files and archives. The data set also contained eight sets of files generated by different real-world, high-profile ransomware strains such as WannaCry, Ryuk, Phobos, Sodinokibi and NetWalker. Previous research has highlighted the difficulty of differentiating between compressed and encrypted files using Shannon entropy, as both file types exhibit similar values. One of the experiments described in this paper reveals a unique characteristic of the Shannon entropy of encrypted file header fragments. This characteristic was used to differentiate between encrypted files and other high-entropy files such as archives. The discovery was leveraged in the development of a file classification model that uses the differential area between the entropy curve of a file under analysis and one generated from random data.
When the entropy plot of a file under analysis is compared against one generated from a file containing purely random numbers, the more closely the two plots correlate, the higher the confidence that the file under analysis contains encrypted data. The experiments demonstrate a high degree of confidence in the accuracy of the model, which achieved a success rate of more than 99.96% when examining only the first 192 bytes of a file, using a mixed data set of more than 80,000 files. This technique successfully addresses the problem of using file entropy to differentiate compressed and archived files from files encrypted by ransomware in a timely manner.
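The idea of a differential area between two entropy curves can be illustrated with a short Python sketch. This is not the authors' implementation: the abstract gives only the 192-byte header length, so the 8-byte step size, the averaging of a seeded pseudo-random baseline, the trapezoidal area calculation and all function names below are assumptions made for illustration. The abstract also states no decision threshold, so none is included here.

```python
import math
import random
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())


def entropy_curve(data: bytes, header_len: int = 192, step: int = 8) -> list:
    """Entropy of progressively longer header prefixes.

    header_len=192 comes from the abstract; step=8 is an assumed value.
    """
    header = data[:header_len]
    return [shannon_entropy(header[:n]) for n in range(step, header_len + 1, step)]


def random_baseline(header_len: int = 192, step: int = 8,
                    samples: int = 200, seed: int = 0) -> list:
    """Average entropy curve over several pseudo-random headers (assumed approach)."""
    rng = random.Random(seed)
    curves = [
        entropy_curve(bytes(rng.getrandbits(8) for _ in range(header_len)),
                      header_len, step)
        for _ in range(samples)
    ]
    return [sum(column) / samples for column in zip(*curves)]


def differential_area(curve: list, baseline: list, step: int = 8) -> float:
    """Area between two entropy curves, using the trapezoidal rule on |difference|."""
    diffs = [abs(a - b) for a, b in zip(curve, baseline)]
    return sum((d0 + d1) / 2 * step for d0, d1 in zip(diffs, diffs[1:]))
```

Under these assumptions, a header whose curve stays close to the random baseline yields a small area, which is consistent with encrypted content, while structured headers such as plain text yield a much larger one. Any cut-off between the two would have to be calibrated on labelled data.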
Citation
Davies, S. R., Macfarlane, R., & Buchanan, W. J. (2021). Differential Area Analysis for Ransomware Attack Detection within Mixed File Datasets. Computers and Security, 108, Article 102377. https://doi.org/10.1016/j.cose.2021.102377
Journal Article Type | Article |
---|---|
Acceptance Date | Jun 15, 2021 |
Online Publication Date | Jun 19, 2021 |
Publication Date | 2021-09 |
Deposit Date | Jun 25, 2021 |
Publicly Available Date | Jun 20, 2022 |
Print ISSN | 0167-4048 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 108 |
Article Number | 102377 |
DOI | https://doi.org/10.1016/j.cose.2021.102377 |
Keywords | Entropy, Ransomware Detection, Test Data Sets |
Public URL | http://researchrepository.napier.ac.uk/Output/2783076 |
Files
Differential Area Analysis For Ransomware Attack Detection Within Mixed File Datasets (accepted version)
(1.4 Mb)
PDF
Licence
http://creativecommons.org/licenses/by-nc-nd/4.0/
Copyright Statement
Accepted version licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.