Person:
Espínola Vílchez, María Rosario

First Name
María Rosario
Last Name
Espínola Vílchez
Affiliation
Universidad Complutense de Madrid
Faculty / Institute
Estudios estadísticos
Department
Estadística y Ciencia de los Datos
Area
Estadística e Investigación Operativa
Identifiers
UCM identifier, ORCID, Scopus Author ID, Web of Science ResearcherID, Dialnet ID

Search Results

  • Item
    On measuring features importance in Machine Learning models in a two-dimensional representation scenario
    (2022) Gutiérrez García-Pardo, Inmaculada; Santos, Daniel; Castro Cantalejo, Javier; Gómez González, Daniel; Espínola Vílchez, María Rosario; Guevara Gil, Juan Antonio
    Abstract: A wide range of papers in the literature on explaining machine learning models use the Shapley value to measure the importance of the features in these models. We can distinguish between those based on cooperative game theory principles and those focused on fuzzy measures. All of these approaches provide only a crisp value (or a fixed point) to measure the importance of a feature in a specific model, because aggregating the different marginal contributions produces a single output for each variable. Nevertheless, because of the relations between features, such a single value cannot distinguish the cases in which not all the features are known. To this end, we propose a disaggregated model which allows the importance of the features to be analyzed with regard to the available information. This new proposal can be viewed as a generalization of all previous measures found in the literature, providing a two-dimensional graph which presents this rich disaggregated information in a very intuitive and visual way. This information may be aggregated with several aggregation functions to obtain new measures of the importance of the features. Specifically, aggregation by the sum results in the Shapley value. We also explain the characteristics of these graphs in different scenarios of the relations among features, so that this useful information can be grasped at a glance.
  • Item
    Explanation of machine learning classification models with fuzzy measures: an approach to individual classification
    (2022) Santos, Daniel; Gutiérrez García-Pardo, Inmaculada; Castro Cantalejo, Javier; Gómez González, Daniel; Guevara Gil, Juan Antonio; Espínola Vílchez, María Rosario; Kahraman, Cengiz; Tolga, A. Cagri; Onar, Sezi Cevik; Cebi, Selcuk; Oztaysi, Basar; Sari, Irem Ucal
    Abstract: In the field of Machine Learning, almost all methodologies for measuring the importance of features in a model share a common point: estimating the value of a collection of features in several situations where different information sources (features) are available. To establish the value of the response feature, these techniques need to know the predictive ability of some features over others. We can distinguish two ways of performing this allocation. The first pays no attention to the available information on the known characteristics and assigns a random allocation value. The other option is to assume that the feasible values for the unknown features must be among the values observed in the sample (in the known part of the database), assuming that the values of the known features are correct. Despite its interest, this approach has a serious overfitting problem when there is a continuous feature: the values of a continuous feature are not likely to occur in any other, so there is a large loss of randomization (there will surely be an insignificant number of records for each possible value). In this scenario, it is probably unrealistic to assume a perfect estimation. In this paper, we therefore propose a new methodology based on fuzzy measures which allows the analysis and consideration of the available information in known features, avoiding the problem of overfitting in the presence of continuous features. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG
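
The first abstract above notes that the Shapley value is a single aggregate of many marginal contributions, and that summing a disaggregated representation recovers it. The following is a minimal sketch of that idea, not the authors' implementation: it splits each feature's Shapley value into one term per coalition size, which is only one plausible reading of the disaggregation the abstract describes. The function name `disaggregated_contributions`, the three features `a`, `b`, `c`, and the toy value table `V` are all hypothetical and chosen purely for illustration.

```python
from itertools import combinations
from math import comb

def disaggregated_contributions(players, value):
    """Return {player: [c_0, ..., c_{n-1}]}, where c_k is the player's average
    marginal contribution to coalitions of size k, divided by n.
    Summing the list for a player gives that player's Shapley value."""
    n = len(players)
    table = {}
    for i in players:
        others = [p for p in players if p != i]
        per_size = []
        for k in range(n):
            total = 0.0
            # All coalitions of size k that do not contain player i.
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                total += value(s | {i}) - value(s)
            # Standard Shapley weighting: 1/n per size, uniform within a size.
            per_size.append(total / (comb(n - 1, k) * n))
        table[i] = per_size
    return table

# Hypothetical toy game: value of each coalition of features (made-up numbers).
V = {
    frozenset(): 0.0,
    frozenset("a"): 10.0,
    frozenset("b"): 20.0,
    frozenset("c"): 0.0,
    frozenset("ab"): 40.0,
    frozenset("ac"): 10.0,
    frozenset("bc"): 20.0,
    frozenset("abc"): 60.0,
}

def value(coalition):
    return V[frozenset(coalition)]

players = ["a", "b", "c"]
contributions = disaggregated_contributions(players, value)

# Aggregating each row by the sum yields the Shapley value of that feature.
shapley = {p: sum(row) for p, row in contributions.items()}

for p in players:
    print(p, [round(c, 3) for c in contributions[p]], round(shapley[p], 3))
```

In this toy game, feature `c` contributes nothing alone or in pairs and only adds value once both other features are present. The per-size rows make that visible, while the summed Shapley value alone does not.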