A framework for data cleaning in data warehouses

Peng, Taoxin

Abstract

Achieving high data quality in data warehouses is a persistent challenge, and data cleaning is a crucial task in meeting it. A range of methods and tools has been developed for this purpose. However, at least two questions still need to be answered: how can the efficiency of data cleaning be improved, and how can its degree of automation be increased? This paper addresses these two questions by presenting a novel framework for managing data cleaning in data warehouses. The framework focuses on the use of data quality dimensions and decouples the cleaning process into several sub-processes. An initial test run of the processes in the framework demonstrates that the approach is efficient and scalable for data cleaning in data warehouses.

Citation

Peng, T. (2008). A framework for data cleaning in data warehouses. Enterprise Information Systems, 473-478

Journal Article Type: Article
Publication Date: 2008
Deposit Date: Feb 8, 2010
Publicly Available Date: May 16, 2017
Print ISSN: 1751-7575
Publisher: Taylor & Francis
Peer Reviewed: Peer Reviewed
Pages: 473-478
Keywords: data cleaning; data warehouse; performance efficiency; automation; data quality; decoupling; scalable
Public URL: http://researchrepository.napier.ac.uk/id/eprint/3467
