Adnan Amin
Comparing Oversampling Techniques to Handle the Class Imbalance Problem: A Customer Churn Prediction Case Study
Amin, Adnan; Anwar, Sajid; Adnan, Awais; Nawaz, Muhammad; Howard, Newton; Qadir, Junaid; Hawalah, Ahmad; Hussain, Amir
Authors
Sajid Anwar
Awais Adnan
Muhammad Nawaz
Newton Howard
Junaid Qadir
Ahmad Hawalah
Prof Amir Hussain A.Hussain@napier.ac.uk
Abstract
Customer retention is a major issue for service-based organizations, particularly in the telecom industry, where predictive models of customer behavior are among the most useful instruments for retaining customers and inferring their future behavior. However, the performance of predictive models degrades considerably when the real-world data set is highly imbalanced. A data set is called imbalanced if the sample size of one class is much smaller or larger than that of the other classes. Over- and undersampling are the most commonly used techniques for handling the class-imbalance problem (CIP) in various domains. In this paper, we survey six well-known sampling techniques and compare their performance: mega-trend diffusion function (MTDF), synthetic minority oversampling technique (SMOTE), adaptive synthetic sampling approach (ADASYN), couples top-N reverse k-nearest neighbor (TRkNN), majority weighted minority oversampling technique (MWMOTE), and immune centroids oversampling technique (ICOTE). We also evaluate four rule-generation algorithms (the learning from example module, version 2 (LEM2), covering, exhaustive, and genetic algorithms) using publicly available data sets. The empirical results demonstrate that MTDF and rule generation based on genetic algorithms achieved the best overall predictive performance among the evaluated oversampling methods and rule-generation algorithms.
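To make the idea of oversampling a minority class concrete, the following is a minimal sketch of SMOTE-style interpolation. It is not the implementation evaluated in the paper: the function name `smote_like_oversample`, its parameters, and the toy churn data are all assumptions made for illustration, and the neighbor search is deliberately simplified.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Simplified SMOTE-style sketch (hypothetical helper, not the paper's code):
    each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: 8 non-churners (majority) vs 3 churners (minority).
majority = [(float(x), float(x % 3)) for x in range(8)]
minority = [(10.0, 10.0), (11.0, 10.5), (10.5, 11.5)]

# Generate just enough synthetic churners to match the majority class size.
new_points = smote_like_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class. The other techniques named in the abstract differ mainly in how they choose where to place synthetic samples.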
Citation
Amin, A., Anwar, S., Adnan, A., Nawaz, M., Howard, N., Qadir, J., Hawalah, A., & Hussain, A. (2016). Comparing Oversampling Techniques to Handle the Class Imbalance Problem: A Customer Churn Prediction Case Study. IEEE Access, 4, 7940-7957. https://doi.org/10.1109/ACCESS.2016.2619719
Journal Article Type | Article |
---|---|
Acceptance Date | Oct 1, 2016 |
Online Publication Date | Oct 26, 2016 |
Publication Date | 2016 |
Deposit Date | Oct 7, 2019 |
Publicly Available Date | Oct 7, 2019 |
Journal | IEEE Access |
Electronic ISSN | 2169-3536 |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
Volume | 4 |
Pages | 7940-7957 |
DOI | https://doi.org/10.1109/ACCESS.2016.2619719 |
Keywords | SMOTE, ADASYN, mega trend diffusion function, class imbalance, rough set, customer churn, mRMR, ICOTE, MWMOTE, TRkNN |
Public URL | http://researchrepository.napier.ac.uk/Output/1792667 |
Files
Comparing Oversampling Techniques to Handle the Class Imbalance Problem: A Customer Churn Prediction Case Study
(8.3 Mb)
PDF
Copyright Statement
© 2016 IEEE. Authors, their employers and/or their funding agencies shall have the right to post the final, published version of IEEE copyrighted articles on their own personal servers or the servers of their institutions or employers without permission from IEEE, provided that the posted version includes a prominently displayed IEEE copyright notice and, when published, a full citation to the original IEEE publication, including the article’s Digital Object Identifier (DOI).