A framework for data cleaning in data warehouses
Authors
Dr Taoxin Peng T.Peng@napier.ac.uk
Lecturer
Abstract
Achieving high data quality in data warehouses is a persistent challenge, and data cleaning is a crucial part of meeting it. A range of methods and tools has been developed for this purpose. However, at least two questions still need to be answered: how can the efficiency of data cleaning be improved, and how can its degree of automation be increased? This paper addresses these two questions by presenting a novel framework for managing data cleaning in data warehouses. The framework focuses on the use of data quality dimensions and decouples the cleaning process into several sub-processes. An initial test run of the processes in the framework demonstrates that the approach is efficient and scalable for data cleaning in data warehouses.
Citation
Peng, T. (2008). A framework for data cleaning in data warehouses. Enterprise Information Systems, 473-478
Journal Article Type | Article |
---|---|
Publication Date | 2008 |
Deposit Date | Feb 8, 2010 |
Publicly Available Date | May 16, 2017 |
Print ISSN | 1751-7575 |
Publisher | Taylor & Francis |
Peer Reviewed | Peer Reviewed |
Pages | 473-478 |
Keywords | data cleaning; data warehouse; performance efficiency; automation; data quality; decoupling; scalable |
Public URL | http://researchrepository.napier.ac.uk/id/eprint/3467 |
Files
A framework for data cleaning in data warehouses.pdf
(76 Kb)
PDF