Optimising Performance Curves for Ensemble Models through Pareto Front Analysis of the Decision Space
Publication date
2025
Publisher
Wiley
Abstract
Receiver operating characteristic curves are commonly used to evaluate the performance of machine learning ensemble classification models that combine multiple classifiers through a voting procedure. Although these models have many parameters, standard ROC analyses typically vary only the voting threshold, limiting their potential for improvement. In this paper, we propose Performance Curve Mapping, a new method that redefines the ROC curve as the Pareto front of a multi‐objective optimisation problem. The method maps the multidimensional space of all ensemble parameters (Decision space) into a two‐dimensional Objective space defined by classification performance metrics. We employ an algorithm based on NSGA‐II to explore the Decision space and validate the proposal on two different classification problems: (1) predicting car insurance claims in a highly imbalanced dataset (Insurance dataset), and (2) predicting obesity risk in a balanced clinical dataset (GenObIA dataset). We compare our method with alternative ensemble optimisation approaches, using visual assessment, the area under the curve and the Youden index as performance measures. In the Insurance dataset, Performance Curve Mapping achieves an average improvement of 46.4% in AUC‐ROC and 26.1% in the Youden index. In the GenObIA dataset, it achieves an average improvement of 29.7% in AUC‐ROC and 11.9% in the Youden index. All improvements are calculated relative to the maximum achievable improvement.
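The idea of reading a ROC curve as a Pareto front can be illustrated with a minimal sketch. It assumes the two objectives are the false positive rate (to minimise) and the true positive rate (to maximise), and that each candidate ensemble configuration has already been evaluated to one (FPR, TPR) point. The function names below are hypothetical, the NSGA-II search itself is not reproduced, and the trapezoidal area assumes linear interpolation between front points; none of this is taken from the paper's implementation.

```python
def pareto_front(points):
    """Keep the non-dominated (fpr, tpr) points: lower fpr and higher tpr are better."""
    # Sort by fpr ascending; within equal fpr, put the highest tpr first.
    ordered = sorted(set(points), key=lambda p: (p[0], -p[1]))
    front = []
    best_tpr = float("-inf")
    for fpr, tpr in ordered:
        # A point survives only if it beats the tpr of every point with fpr <= its own.
        if tpr > best_tpr:
            front.append((fpr, tpr))
            best_tpr = tpr
    return front


def auc_trapezoid(front):
    """Trapezoidal area under the front, anchored at (0, 0) and (1, 1)."""
    curve = [(0.0, 0.0)] + list(front) + [(1.0, 1.0)]
    area = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def youden_index(front):
    """Largest value of tpr - fpr over the front."""
    return max(tpr - fpr for fpr, tpr in front)


# Illustrative (made-up) evaluations of five candidate configurations.
candidates = [(0.1, 0.5), (0.2, 0.4), (0.2, 0.7), (0.5, 0.9), (0.6, 0.8)]
front = pareto_front(candidates)
print(front)
print(auc_trapezoid(front), youden_index(front))
```

In this toy input, the points (0.2, 0.4) and (0.6, 0.8) are dropped because another configuration achieves a higher true positive rate at an equal or lower false positive rate; the remaining points form the performance curve from which the AUC and the Youden index are computed.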