Yinghai Zhou
CDTier: A Chinese Dataset of Threat Intelligence Entity Relationships
Zhou, Yinghai; Ren, Yitong; Yi, Ming; Xiao, Yanjun; Tan, Zhiyuan; Moustafa, Nour; Tian, Zhihong
Authors
Yitong Ren
Ming Yi
Yanjun Xiao
Dr Thomas Tan Z.Tan@napier.ac.uk
Associate Professor
Nour Moustafa
Zhihong Tian
Abstract
Cyber Threat Intelligence (CTI), which is knowledge of cyberspace threats gathered from security data, is critical in defending against cyberattacks. However, there is no open-source CTI dataset that lets security researchers effectively apply the enormous volume of CTI information to security analysis, particularly for Chinese threat intelligence. To support network security research and development, this paper constructs a Chinese CTI entity relationship dataset, CDTier, which includes: 1) a threat entity extraction dataset composed of 100 CTI reports, 3744 threat sentences and 4259 threat knowledge objects; 2) an entity relation extraction dataset composed of 100 CTI reports, 2598 threat sentences and 2562 knowledge object relations. CDTier is, as far as we know, the first CTI dataset. On CDTier, we trained 4 models for threat entity extraction and relation extraction using well-established and widely used deep learning methods in NLP. The results showed that models trained on CDTier extract the knowledge objects and relationships described in threat intelligence more accurately. This significantly reduces the workload of threat intelligence analysts when assessing threat intelligence. CDTier is available on GitHub: https://github.com/MuYu-z/CDTier
Citation
Zhou, Y., Ren, Y., Yi, M., Xiao, Y., Tan, Z., Moustafa, N., & Tian, Z. (2023). CDTier: A Chinese Dataset of Threat Intelligence Entity Relationships. IEEE Transactions on Sustainable Computing, 8(4), 627-638. https://doi.org/10.1109/TSUSC.2023.3240411
Journal Article Type | Article |
---|---|
Acceptance Date | Jan 23, 2023 |
Online Publication Date | Jan 30, 2023 |
Publication Date | 2023 |
Deposit Date | Jan 31, 2023 |
Publicly Available Date | Jan 31, 2023 |
Electronic ISSN | 2377-3782 |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
Volume | 8 |
Issue | 4 |
Pages | 627-638 |
DOI | https://doi.org/10.1109/TSUSC.2023.3240411 |
Keywords | Cyber threat intelligence, entity relation extraction, information extraction, NLP, threat entity extraction |
Public URL | http://researchrepository.napier.ac.uk/Output/3014335 |
Files
CDTier: A Chinese Dataset Of Threat Intelligence Entity Relationships (accepted version)
(12.1 Mb)
PDF