Double-weighted kNN: a simple and efficient variant with embedded feature selection

dc.contributor.author: Calviño Martínez, Aída
dc.contributor.author: Moreno-Ribera, Almudena
dc.contributor.editor: Krishen, Anjala S.
dc.contributor.editor: Petrescu, Maria
dc.date.accessioned: 2024-04-10T15:10:23Z
dc.date.available: 2024-04-10T15:10:23Z
dc.date.issued: 2024
dc.description.abstract: Predictive modeling aims at providing estimates of an unknown variable, the target, from a set of known ones, the inputs. The k-nearest neighbors (kNN) algorithm is one of the best-known predictive algorithms due to its simplicity and good behavior. However, this class of models has some drawbacks, such as its lack of robustness to irrelevant input features and the need to transform qualitative variables into dummies, with the corresponding loss of information for ordinal ones. In this work, a kNN regression variant, easily adaptable for classification purposes, is suggested. The proposal handles all types of input variables while embedding feature selection in a simple and efficient manner, which shortens the tuning phase. More precisely, making use of the weighted Gower distance, we develop a powerful tool to cope with these drawbacks. Finally, to boost the tool's predictive power, a second weighting scheme is applied to the neighbors. The proposed method is applied to a collection of 20 data sets that differ in size, data type, and distribution of the target variable. Moreover, the results are compared with previously proposed kNN variants, showing its superiority, particularly when the weighting scheme is based on non-linear association measures.
dc.description.department: Dept. of Statistics and Data Science
dc.description.faculty: Faculty of Statistical Studies
dc.description.refereed: TRUE
dc.description.sponsorship: European Commission
dc.description.sponsorship: Comunidad de Madrid
dc.description.status: pub
dc.identifier.citation: Moreno-Ribera, A., Calviño, A. Double-weighted kNN: a simple and efficient variant with embedded feature selection. J Market Anal (2024). https://doi.org/10.1057/s41270-024-00302-5
dc.identifier.doi: 10.1057/s41270-024-00302-5
dc.identifier.officialurl: https://doi.org/10.1057/s41270-024-00302-5
dc.identifier.relatedurl: https://link.springer.com/article/10.1057/s41270-024-00302-5
dc.identifier.uri: https://hdl.handle.net/20.500.14352/102967
dc.journal.title: Journal of Marketing Analytics
dc.language.iso: eng
dc.page.final: 11
dc.page.initial: 1
dc.publisher: Palgrave Macmillan
dc.relation.projectID: CT36/22-04-UCM-INV
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.accessRights: metadata only access
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject.cdu: 519.233.5
dc.subject.cdu: 519.237
dc.subject.cdu: 519.2
dc.subject.cdu: 519.862.6
dc.subject.cdu: 519.8
dc.subject.keyword: Gower distance
dc.subject.keyword: Ordinal variables
dc.subject.keyword: Machine learning
dc.subject.keyword: Regression
dc.subject.keyword: Weighting scheme
dc.subject.ucm: Multivariate analysis
dc.subject.ucm: Econometrics (Statistics)
dc.subject.ucm: Mathematical statistics (Statistics)
dc.subject.ucm: Operations research (Statistics)
dc.subject.unesco: 1209.04 Decision theory and process
dc.subject.unesco: 1209.09 Multivariate analysis
dc.subject.unesco: 1209.14 Statistical prediction techniques
dc.title: Double-weighted kNN: a simple and efficient variant with embedded feature selection
dc.type: journal article
dc.type.hasVersion: CVoR
dspace.entity.type: Publication
relation.isAuthorOfPublication: 9910901c-7e34-482c-b57c-470f4e445cfb
relation.isAuthorOfPublication.latestForDiscovery: 9910901c-7e34-482c-b57c-470f4e445cfb
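
The abstract describes two weighting layers: feature weights inside a Gower distance, and a second set of weights on the neighbors. The sketch below is one possible reading of that idea, not the authors' implementation. The function names (`weighted_gower`, `predict`), the hand-picked feature weights, and the inverse-distance neighbor weights are all assumptions made for illustration; the record does not say which association measures or neighbor weights the paper uses. Ordinal variables are not handled separately here; one common option is to code them as ranks and treat them as numeric.

```python
from typing import List, Optional, Sequence, Union

Value = Union[float, str]


def weighted_gower(a: Sequence[Value], b: Sequence[Value],
                   ranges: Sequence[Optional[float]],
                   weights: Sequence[float]) -> float:
    """Weighted Gower distance between two mixed-type records.

    Numeric features contribute |a - b| / range; categorical (str) features
    contribute 0 on a match and 1 otherwise. A feature with weight 0 is
    ignored, which is how feature selection can be embedded in the distance.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one feature weight must be positive")
    dist = 0.0
    for x, y, r, w in zip(a, b, ranges, weights):
        if isinstance(x, str):
            d = 0.0 if x == y else 1.0
        else:
            d = abs(x - y) / r if r else 0.0
        dist += w * d
    return dist / total


def predict(query: Sequence[Value], X: List[Sequence[Value]], y: List[float],
            ranges: Sequence[Optional[float]], weights: Sequence[float],
            k: int = 3, eps: float = 1e-9) -> float:
    """kNN regression with a second weighting step on the neighbors.

    Assumption: neighbors are weighted by inverse distance; other schemes
    would slot in at the `neighbor_w` line.
    """
    scored = sorted(
        (weighted_gower(query, row, ranges, weights), target)
        for row, target in zip(X, y)
    )
    nearest = scored[:k]
    neighbor_w = [1.0 / (d + eps) for d, _ in nearest]
    return sum(w * t for w, (_, t) in zip(neighbor_w, nearest)) / sum(neighbor_w)


# Toy data (invented for this example): one numeric and one categorical input.
X = [(1.0, "a"), (2.0, "a"), (10.0, "b"), (11.0, "b")]
y = [10.0, 12.0, 50.0, 52.0]
ranges = [10.0, None]   # range of the numeric feature; unused for categoricals
weights = [0.7, 0.3]    # hypothetical feature weights

print(predict((1.5, "a"), X, y, ranges, weights, k=2))
```

Setting a feature's weight to 0 removes it from the distance entirely, so deriving the weights from an input-target association measure would perform feature selection without a separate search step.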
