A fair-multicluster approach to clustering of categorical data

dc.contributor.author: Santos Mangudo, Carlos
dc.contributor.author: Heras Martínez, Antonio José
dc.date.accessioned: 2023-06-22T12:29:58Z
dc.date.available: 2023-06-22T12:29:58Z
dc.date.issued: 2022-11-08
dc.description: CRUE-CSIC (Acuerdos Transformativos 2022)
dc.description.abstract: In the last few years, the need to prevent classification biases due to race, gender, social status, etc. has increased interest in designing fair clustering algorithms. The main idea is to ensure that the output of a clustering algorithm is not biased towards or against specific subgroups of the population. There is a growing specialized literature on this topic, dealing with the problem of clustering numerical databases. Nevertheless, to our knowledge, there are no previous papers devoted to the problem of fair clustering of purely categorical attributes. In this paper, we show that the Multicluster methodology proposed by Santos and Heras (Interdiscip J Inf Knowl Manag 15:227–246, 2020. https://doi.org/10.28945/4643) for clustering categorical data can be modified in order to increase the fairness of the clusters. Of course, there is a tradeoff between fairness and efficiency, so that an increase in fairness usually leads to a loss of classification efficiency. Yet it is possible to reach a reasonable compromise between these goals, since the methodology proposed by Santos and Heras (2020) can be easily adapted to obtain homogeneous and fair clusters.
dc.description.department: Depto. de Economía Financiera y Actuarial y Estadística
dc.description.faculty: Fac. de Ciencias Económicas y Empresariales
dc.description.refereed: TRUE
dc.description.status: pub
dc.eprint.id: https://eprints.ucm.es/id/eprint/75675
dc.identifier.doi: 10.1007/s10100-022-00824-2
dc.identifier.issn: 1435-246X
dc.identifier.officialurl: https://doi.org/10.1007/s10100-022-00824-2
dc.identifier.uri: https://hdl.handle.net/20.500.14352/72690
dc.journal.title: Central European Journal of Operations Research
dc.language.iso: eng
dc.publisher: Springer Nature
dc.rights: Attribution 3.0 Spain
dc.rights.accessRights: open access
dc.rights.uri: https://creativecommons.org/licenses/by/3.0/es/
dc.subject.keyword: Clustering
dc.subject.keyword: Fairness
dc.subject.keyword: Fair clustering
dc.subject.keyword: Categorical data
dc.subject.ucm: Statistics
dc.subject.unesco: 1209 Statistics
dc.title: A fair-multicluster approach to clustering of categorical data
dc.type: journal article
dspace.entity.type: Publication
relation.isAuthorOfPublication: 047e514e-3517-4265-8bb2-dba00d57d167
relation.isAuthorOfPublication.latestForDiscovery: 047e514e-3517-4265-8bb2-dba00d57d167

Download

Original bundle

Name: s10100-022-00824-2.pdf
Size: 516.43 KB
Format: Adobe Portable Document Format
