Cluster-based oversampling with area extraction from representative points for class imbalance learning
Farou, Zakarya; Wang, Yizhi; Horváth, Tomáš
Authors
Zakarya Farou
Yizhi Wang
Tomáš Horváth
Abstract
Class imbalance learning is challenging in various domains where training datasets exhibit disproportionate samples in a specific class. Resampling methods have been used to adjust the class distribution, but they often have limitations for small disjunct minority subsets. This paper introduces AROSS, an adaptive cluster-based oversampling approach that addresses these limitations. AROSS utilizes an optimized agglomerative clustering algorithm with the Cophenetic Correlation Coefficient and the Bayesian Information Criterion to identify representative areas of the minority class. Safe and half-safe areas are obtained using an incremental k-Nearest Neighbor strategy, and oversampling is performed with a truncated hyperspherical Gaussian distribution. Experimental evaluations on 70 binary datasets demonstrate the effectiveness of AROSS in improving class imbalance learning performance, making it a promising solution for mitigating class imbalance challenges, especially for small disjunct minority subsets.
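To make the final oversampling step more concrete, the sketch below draws synthetic points from an isotropic Gaussian centred on a representative point and keeps only those that fall inside a hypersphere of a given radius (simple rejection sampling). It is an illustration of the general idea of a truncated hyperspherical Gaussian, not the AROSS implementation: the function name `sample_truncated_gaussian`, the parameters `radius`, `sigma` and `max_tries`, and the example values are all assumptions made here, and the clustering and k-Nearest Neighbor area-extraction steps described in the abstract are not reproduced.

```python
import math
import random


def sample_truncated_gaussian(center, radius, sigma, n_samples,
                              rng=None, max_tries=10_000):
    """Draw points from an isotropic Gaussian around `center`, keeping only
    those inside the hypersphere of the given `radius` (rejection sampling).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if radius <= 0 or sigma <= 0:
        raise ValueError("radius and sigma must be positive")
    rng = rng or random.Random()
    samples = []
    tries = 0
    while len(samples) < n_samples:
        if tries >= max_tries:
            raise RuntimeError("too many rejections; raise radius or lower sigma")
        tries += 1
        # One Gaussian offset per feature dimension.
        offset = [rng.gauss(0.0, sigma) for _ in center]
        # Accept the candidate only if it lies within the hypersphere.
        if math.sqrt(sum(o * o for o in offset)) <= radius:
            samples.append([c + o for c, o in zip(center, offset)])
    return samples


# Assumed example: one representative minority-class point in 3 dimensions.
rep_point = [1.0, 2.0, 3.0]
synthetic = sample_truncated_gaussian(
    rep_point, radius=0.5, sigma=0.25, n_samples=20, rng=random.Random(0)
)
```

In this sketch `radius` stands in for the extent of a safe or half-safe area around a representative point. How the method actually derives those areas and the Gaussian's spread is specified in the paper itself.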
Citation
Farou, Z., Wang, Y., & Horváth, T. (2024). Cluster-based oversampling with area extraction from representative points for class imbalance learning. Intelligent Systems with Applications, 22, Article 200357. https://doi.org/10.1016/j.iswa.2024.200357
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Mar 8, 2024 |
Online Publication Date | Mar 16, 2024 |
Publication Date | 2024-06 |
Deposit Date | Mar 25, 2024 |
Publicly Available Date | Mar 25, 2024 |
Journal | Intelligent Systems with Applications |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 22 |
Article Number | 200357 |
DOI | https://doi.org/10.1016/j.iswa.2024.200357 |
Keywords | Artificial Intelligence; Computer Science Applications; Computer Vision and Pattern Recognition; Signal Processing; Computer Science (miscellaneous) |
Public URL | http://researchrepository.napier.ac.uk/Output/3575166 |
Files
Cluster-based Oversampling With Area Extraction From Representative Points For Class Imbalance Learning (PDF, 4 MB)
Publisher Licence URL
http://creativecommons.org/licenses/by-nc/4.0/