Optimising Performance Curves for Ensemble Models through Pareto Front Analysis of the Decision Space

Publication date

2025

Publisher

Wiley

Abstract

Receiver operating characteristic (ROC) curves are commonly used to evaluate the performance of machine learning ensemble classification models that combine multiple classifiers through a voting procedure. Although these models have many parameters, standard ROC analyses typically vary only the voting threshold, which limits how far the curve can be improved. In this paper, we propose Performance Curve Mapping, a new method that redefines the ROC curve as the Pareto front of a multi-objective optimisation problem. The method maps the multidimensional space of all ensemble parameters (Decision space) into a two-dimensional Objective space defined by classification performance metrics. We employ an algorithm based on NSGA-II to explore the Decision space and validate the proposal on two different classification problems: (1) predicting car insurance claims in a highly imbalanced dataset (Insurance dataset), and (2) predicting obesity risk in a balanced clinical dataset (GenObIA dataset). We compare our method with alternative ensemble optimisation approaches, using visual assessment, the area under the ROC curve (AUC-ROC) and the Youden index as performance measures. In the Insurance dataset, Performance Curve Mapping achieves an average improvement of 46.4% in AUC-ROC and 26.1% in the Youden index. In the GenObIA dataset, it achieves an average improvement of 29.7% in AUC-ROC and 11.9% in the Youden index. All improvements are expressed relative to the maximum achievable improvement.
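
The idea of reading a ROC curve as a Pareto front can be illustrated with a minimal Python sketch. It is not the authors' implementation: the NSGA-II search over the Decision space is not reproduced, and the candidate points are made-up (FPR, TPR) pairs standing in for evaluated ensemble configurations. The function names are hypothetical, the trapezoidal AUC with (0, 0) and (1, 1) anchors is an assumption, and `relative_improvement` reflects one plausible reading of "relative to the maximum achievable improvement", namely (improved - baseline) / (best - baseline) with best = 1.

```python
def pareto_front(points):
    """Keep the non-dominated (FPR, TPR) points: lower FPR and higher TPR are better."""
    front, best_tpr = [], float("-inf")
    # Sort by FPR ascending; for equal FPR, the highest TPR comes first.
    for fpr, tpr in sorted(points, key=lambda p: (p[0], -p[1])):
        if tpr > best_tpr:  # strictly higher TPR than every point with lower or equal FPR
            front.append((fpr, tpr))
            best_tpr = tpr
    return front


def auc_trapezoid(front):
    """Area under the front, anchored at (0, 0) and (1, 1) (assumed convention)."""
    curve = [(0.0, 0.0)] + list(front) + [(1.0, 1.0)]
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(curve, curve[1:]))


def youden_index(front):
    """Largest value of TPR - FPR over the points of the front."""
    return max(tpr - fpr for fpr, tpr in front)


def relative_improvement(baseline, improved, best=1.0):
    """Assumed formula: share of the remaining gap to `best` that was closed."""
    return (improved - baseline) / (best - baseline)


# Hypothetical (FPR, TPR) pairs, one per evaluated ensemble configuration.
candidates = [(0.1, 0.5), (0.2, 0.4), (0.3, 0.8), (0.5, 0.9), (0.4, 0.7)]
front = pareto_front(candidates)
auc = auc_trapezoid(front)
j = youden_index(front)
```

In this toy example the points (0.2, 0.4) and (0.4, 0.7) are dominated and drop out of the front. Varying only the voting threshold would trace a single curve through the Objective space, whereas the Pareto front is taken over every configuration evaluated, which is why it can only match or exceed a threshold-only curve on the same data.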
