Balanced Underbagged Ensemble Approach for Classifying Highly Imbalanced Datasets in the Insurance and Financial Sectors

dc.contributor.author: Gutierrez-Gallego, Alberto
dc.contributor.author: Garnica Alcázar, Antonio Óscar
dc.contributor.author: Velasco, Jose M.
dc.contributor.author: Parra, Daniel
dc.date.accessioned: 2026-02-23T17:12:20Z
dc.date.available: 2026-02-23T17:12:20Z
dc.date.issued: 2025
dc.description.abstract: Data bias is a critical challenge in machine learning applications within the financial and insurance sectors, as it can lead to misleading risk assessments and inaccurate predictive models. A prevalent source of bias in real‐world datasets is the imbalanced distribution of classes, which is particularly problematic in fraud detection, credit risk assessment, and claim prediction. Traditional approaches to handling imbalanced data often rely on undersampling or oversampling techniques. However, these methods may generate unrealistic minority class samples or fail to perform effectively when dealing with extreme class imbalances. In this paper, we propose a configurable technique based on the underbagging method, integrated with a classifier for highly imbalanced datasets. Our approach is designed to enhance the predictive accuracy of the minority class while maintaining robust performance for the majority class. We incorporate our methodology into a classification ensemble framework and evaluate its effectiveness by comparing it against 100 combinations of 10 different oversampling and undersampling techniques applied to 10 different machine learning algorithms. The evaluation is conducted on two highly imbalanced real‐world datasets: one related to auto insurance claims and another focused on credit card fraud detection. Our statistical analysis demonstrates that Balanced Underbagged Ensemble achieves superior classification performance in terms of recall for both classes, regardless of the base machine learning model used within the ensemble. Furthermore, our method finds an optimal balance between classification performance and computational efficiency.
dc.description.department: Depto. de Arquitectura de Computadores y Automática
dc.description.faculty: Fac. de Informática
dc.description.refereed: TRUE
dc.description.status: pub
dc.identifier.doi: 10.1002/isaf.70018
dc.identifier.uri: https://hdl.handle.net/20.500.14352/132943
dc.issue.number: 4
dc.journal.title: Intelligent Systems in Accounting, Finance and Management
dc.language.iso: eng
dc.publisher: Wiley
dc.rights.accessRights: metadata only access
dc.subject.ucm: Informática (Informática)
dc.subject.unesco: 33 Ciencias Tecnológicas
dc.title: Balanced Underbagged Ensemble Approach for Classifying Highly Imbalanced Datasets in the Insurance and Financial Sectors
dc.type: journal article
dc.type.hasVersion: AM
dc.volume.number: 32
dspace.entity.type: Publication
relation.isAuthorOfPublication: 33d1dfc8-7bd7-4f4d-ac77-e9c369e8d82e
relation.isAuthorOfPublication.latestForDiscovery: 33d1dfc8-7bd7-4f4d-ac77-e9c369e8d82e
