Cluster-based oversampling with area extraction from representative points for class imbalance learning

Farou, Zakarya; Wang, Yizhi; Horváth, Tomáš




Class imbalance learning is challenging in many domains where the training data contain disproportionately few or many samples of one class. Resampling methods are commonly used to adjust the class distribution, but they are often limited when the minority class forms small, disjunct subsets. This paper introduces AROSS, an adaptive cluster-based oversampling approach that addresses these limitations. AROSS uses an optimized agglomerative clustering algorithm, guided by the Cophenetic Correlation Coefficient and the Bayesian Information Criterion, to identify representative areas of the minority class. Safe and half-safe areas are then obtained with an incremental k-Nearest Neighbor strategy, and new samples are drawn from a truncated hyperspherical Gaussian distribution. Experimental evaluations on 70 binary datasets show that AROSS improves class imbalance learning performance, making it a promising option for mitigating class imbalance, especially when the minority class consists of small, disjunct subsets.
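The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' AROSS implementation: the function names (`best_linkage`, `sample_truncated_gaussian`, `oversample_minority`), the use of the cluster mean as the representative point, the maximum member distance as the area radius, and the `sigma_scale` parameter are all assumptions made for illustration. The BIC-based choice of the number of clusters and the incremental k-Nearest Neighbor extraction of safe and half-safe areas are omitted, and the truncation is implemented here by simple rejection sampling.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import pdist


def best_linkage(X, methods=("single", "complete", "average", "ward")):
    """Return (method, Z, ccc) for the linkage with the highest
    cophenetic correlation coefficient (CCC)."""
    dists = pdist(X)  # condensed pairwise Euclidean distances
    best = None
    for method in methods:
        Z = linkage(X, method=method)
        ccc, _ = cophenet(Z, dists)
        if best is None or ccc > best[2]:
            best = (method, Z, ccc)
    return best


def sample_truncated_gaussian(center, radius, n, rng, sigma_scale=0.5):
    """Draw n points from an isotropic Gaussian at `center`, rejecting
    any draw that falls outside the hypersphere of the given radius."""
    center = np.asarray(center, dtype=float)
    sigma = sigma_scale * radius  # assumed spread, not taken from the paper
    kept = []
    while len(kept) < n:
        draws = center + rng.normal(scale=sigma, size=(n, center.size))
        inside = np.linalg.norm(draws - center, axis=1) <= radius
        kept.extend(draws[inside])
    return np.array(kept[:n])


def oversample_minority(X_min, n_clusters, n_new, seed=0):
    """Cluster the minority samples, then sample around each cluster."""
    rng = np.random.default_rng(seed)
    _, Z, _ = best_linkage(X_min)
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    cluster_ids = np.unique(labels)
    per_cluster = int(np.ceil(n_new / len(cluster_ids)))
    synthetic = []
    for cid in cluster_ids:
        members = X_min[labels == cid]
        center = members.mean(axis=0)  # assumed representative point
        # Assumed area radius: farthest member from the cluster mean.
        radius = np.linalg.norm(members - center, axis=1).max()
        synthetic.append(
            sample_truncated_gaussian(center, radius, per_cluster, rng)
        )
    return np.vstack(synthetic)[:n_new]
```

Calling `oversample_minority(X_min, n_clusters=2, n_new=7)` on an array of minority samples returns seven synthetic points, each lying inside the hypersphere of one cluster. Confining samples to a hypersphere around each cluster keeps them close to the observed minority data, which is the intuition behind sampling within extracted areas rather than interpolating across the whole class.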

Journal Article Type: Article
Acceptance Date: Mar 8, 2024
Online Publication Date: Mar 16, 2024
Publication Date: 2024-06
Deposit Date: Mar 25, 2024
Publicly Available Date: Mar 25, 2024
Journal: Intelligent Systems with Applications
Publisher: Elsevier
Peer Reviewed: Peer Reviewed
Volume: 22
Article Number: 200357
Keywords: Artificial Intelligence; Computer Science Applications; Computer Vision and Pattern Recognition; Signal Processing; Computer Science (miscellaneous)

