Double-weighted kNN: a simple and efficient variant with embedded feature selection

Calviño Martínez, Aída; Moreno-Ribera, Almudena

dc.contributor.author	Calviño Martínez, Aída
dc.contributor.author	Moreno-Ribera, Almudena
dc.contributor.editor	Krishen, Anjala S.
dc.contributor.editor	Petrescu, Maria
dc.date.accessioned	2024-04-10T15:10:23Z
dc.date.available	2024-04-10T15:10:23Z
dc.date.issued	2024
dc.description.abstract	Predictive modeling aims at providing estimates of an unknown variable, the target, from a set of known ones, the inputs. The k Nearest Neighbors (kNN) algorithm is one of the best-known predictive algorithms due to its simplicity and good behavior. However, this class of models has some drawbacks, such as a lack of robustness to irrelevant input features and the need to transform qualitative variables into dummies, with the corresponding loss of information for ordinal ones. In this work, a kNN regression variant, easily adaptable for classification purposes, is suggested. The proposal allows dealing with all types of input variables while embedding feature selection in a simple and efficient manner, reducing the tuning phase. More precisely, making use of the weighted Gower distance, we develop a powerful tool to cope with these inconveniences. Finally, to boost the tool's predictive power, a second weighting scheme is added to the neighbors. The proposed method is applied to a collection of 20 data sets, different in size, data type, and distribution of the target variable. Moreover, the results are compared with previously proposed kNN variants, showing its superiority, particularly when the weighting scheme is based on non-linear association measures.
dc.description.department	Depto. de Estadística y Ciencia de los Datos
dc.description.faculty	Fac. de Estudios Estadísticos
dc.description.refereed	TRUE
dc.description.sponsorship	European Commission
dc.description.sponsorship	Comunidad de Madrid
dc.description.status	pub
dc.identifier.citation	Moreno-Ribera, A., Calviño, A. Double-weighted kNN: a simple and efficient variant with embedded feature selection. J Market Anal (2024). https://doi.org/10.1057/s41270-024-00302-5
dc.identifier.doi	10.1057/s41270-024-00302-5
dc.identifier.officialurl	https://doi.org/10.1057/s41270-024-00302-5
dc.identifier.relatedurl	https://link.springer.com/article/10.1057/s41270-024-00302-5
dc.identifier.uri	https://hdl.handle.net/20.500.14352/102967
dc.journal.title	Journal of Marketing Analytics
dc.language.iso	eng
dc.page.final	11
dc.page.initial	1
dc.publisher	Palgrave Macmillan
dc.relation.projectID	CT36/22-04-UCM-INV
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	en
dc.rights.accessRights	metadata only access
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject.cdu	519.233.5
dc.subject.cdu	519.237
dc.subject.cdu	519.2
dc.subject.cdu	519.862.6
dc.subject.cdu	519.8
dc.subject.keyword	Gower distance
dc.subject.keyword	Ordinal variables
dc.subject.keyword	Machine learning
dc.subject.keyword	Regression
dc.subject.keyword	Weighting scheme
dc.subject.ucm	Multivariate Analysis
dc.subject.ucm	Econometrics (Statistics)
dc.subject.ucm	Mathematical Statistics (Statistics)
dc.subject.ucm	Operations Research (Statistics)
dc.subject.unesco	1209.04 Decision Theory and Processes
dc.subject.unesco	1209.09 Multivariate Analysis
dc.subject.unesco	1209.14 Statistical Prediction Techniques
dc.title	Double-weighted kNN: a simple and efficient variant with embedded feature selection
dc.type	journal article
dc.type.hasVersion	CVoR
dspace.entity.type	Publication
relation.isAuthorOfPublication	9910901c-7e34-482c-b57c-470f4e445cfb
relation.isAuthorOfPublication.latestForDiscovery	9910901c-7e34-482c-b57c-470f4e445cfb
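
The abstract describes two layers of weights: feature weights inside a weighted Gower distance, and a second set of weights on the selected neighbors. The following is a minimal Python sketch of that general idea, not the authors' implementation. The record does not specify either weighting scheme, so the hand-set feature weights `w`, the inverse-distance neighbor weights, and the function names `weighted_gower` and `dw_knn_predict` are all assumptions made for illustration. Ordinal variables are assumed to be rank-coded and treated like numeric ones, and nominal variables are assumed to be stored as integer codes.

```python
import numpy as np

def weighted_gower(x, X, kinds, ranges, w):
    """Weighted Gower distance from one row x to every row of X.

    kinds[j] is "num" (numeric or rank-coded ordinal) or "cat" (nominal code).
    ranges[j] is the training range of feature j (used for "num" only).
    w[j] is a non-negative feature weight; a zero weight drops the feature.
    """
    d = np.zeros(X.shape[0])
    for j, kind in enumerate(kinds):
        if kind == "num":
            # Range-normalised absolute difference (0 if the feature is constant).
            dj = np.abs(X[:, j] - x[j]) / ranges[j] if ranges[j] > 0 else 0.0
        else:
            # Simple matching: 0 if the categories agree, 1 otherwise.
            dj = (X[:, j] != x[j]).astype(float)
        d += w[j] * dj
    return d / np.sum(w)

def dw_knn_predict(x, X, y, kinds, w, k=3, eps=1e-9):
    """kNN regression with feature weights (in the distance) and neighbor weights."""
    ranges = X.max(axis=0) - X.min(axis=0)
    d = weighted_gower(x, X, kinds, ranges, w)
    idx = np.argsort(d)[:k]
    # Second weighting, assumed here to be inverse distance.
    nw = 1.0 / (d[idx] + eps)
    return float(np.sum(nw * y[idx]) / np.sum(nw))

# Toy data: numeric feature, nominal feature, and a noise feature.
X = np.array([[1.0, 0, 5.0], [2.0, 0, 1.0], [3.0, 1, 9.0], [4.0, 1, 2.0]])
y = np.array([10.0, 20.0, 30.0, 40.0])
kinds = ["num", "cat", "num"]
w = np.array([1.0, 1.0, 0.0])  # hypothetical weights; the third feature is switched off

pred = dw_knn_predict(np.array([2.0, 0, 100.0]), X, y, kinds, w, k=2)
```

Setting a feature weight to zero removes that feature from the distance, which is one simple way feature selection can be embedded in the model. In practice the weights would be derived from an association measure between each input and the target rather than set by hand.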

Collections

Artículos