Towards a Synthetic Data Generator for Matching Decision Trees

Peng, Taoxin; Hanke, Florian

Authors

Taoxin Peng
Florian Hanke

Abstract

Real-world data is commonly used to evaluate or teach data mining techniques. However, using real-world data for these purposes has some disadvantages. Firstly, real-world data in most domains is difficult to obtain for budgetary, technical, or ethical reasons. Secondly, the use of many real-world data sets is restricted. In the case of data mining, those data sets either do not contain specific patterns that are easy to mine for teaching purposes, or the data needs special preparation and the algorithm needs very specific settings in order to find patterns in it. A possible solution is the generation of synthetic "meaningful data" (data with intrinsic patterns). This paper presents a framework for such a data generator, which is able to generate data sets with intrinsic patterns, such as decision trees. A preliminary run of the prototype shows that the generation of such "meaningful data" is possible. The proposed approach could also be extended to generate synthetic data with other intrinsic patterns.
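The abstract does not describe how the framework is implemented, so the following is only a minimal sketch of the general idea of generating data that matches a given decision tree. Everything in it is a hypothetical choice made for illustration and is not taken from the paper: the tree itself, the attribute names `age` and `income`, the value ranges, and the helpers `classify` and `generate`. Attribute values are drawn at random, and each record is then labelled by walking the tree, so the pattern is present in the data by construction.

```python
import random

# Hypothetical target pattern (an assumption, not from the paper):
# internal nodes test one numeric attribute against a threshold,
# leaves carry a class label.
TREE = {
    "attr": "age", "threshold": 30,
    "left": {"label": "no"},            # age < 30
    "right": {                          # age >= 30
        "attr": "income", "threshold": 50_000,
        "left": {"label": "no"},        # income < 50000
        "right": {"label": "yes"},      # income >= 50000
    },
}

# Hypothetical inclusive value ranges for each attribute.
RANGES = {"age": (18, 80), "income": (10_000, 120_000)}


def classify(tree, row):
    """Walk the tree from the root and return the leaf label for one row."""
    node = tree
    while "label" not in node:
        branch = "left" if row[node["attr"]] < node["threshold"] else "right"
        node = node[branch]
    return node["label"]


def generate(tree, ranges, n, seed=0):
    """Draw n random rows and label each one with the tree."""
    rng = random.Random(seed)  # seeded so the data set is reproducible
    rows = []
    for _ in range(n):
        row = {name: rng.randint(lo, hi) for name, (lo, hi) in ranges.items()}
        row["class"] = classify(tree, row)
        rows.append(row)
    return rows


data = generate(TREE, RANGES, n=1000)
```

Because every label here is a deterministic function of the attributes, a decision tree learner trained on `data` would be expected to find splits close to the ones in `TREE`. A generator meant for teaching might additionally add label noise or irrelevant attributes, which this sketch leaves out.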

Presentation Conference Type: Conference Paper (Published)
Conference Name: 18th International Conference on Enterprise Information Systems
Start Date: Apr 25, 2016
End Date: Apr 28, 2016
Acceptance Date: Feb 20, 2016
Online Publication Date: Apr 25, 2016
Publication Date: Apr 25, 2016
Deposit Date: Dec 13, 2017
Publicly Available Date: Dec 15, 2017
Publisher: Scitepress Digital Library
Pages: 135-141
Book Title: Proceedings of the 18th International Conference on Enterprise Information Systems
Chapter Number: 135-141
ISBN: 978-989-758-187-8
DOI: https://doi.org/10.5220/0005829001350141
Keywords: Synthetic, Data Generator, Data Mining, Decision Trees, Classification, Pattern
Public URL: http://researchrepository.napier.ac.uk/Output/947202
Publisher URL: http://www.scitepress.org/DigitalLibrary
Contract Date: Dec 13, 2017
