
A Method to Generate Soft Reference Data for Topic Identification

dc.conference.date: June 15–19, 2020
dc.conference.place: Lisboa, Portugal
dc.conference.title: International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2020)
dc.contributor.author: Vélez Serrano, Daniel
dc.contributor.author: Villarino, Guillermo
dc.contributor.author: Rodríguez González, Juan Tinguaro
dc.contributor.author: Gómez González, Daniel
dc.date.accessioned: 2025-01-20T16:03:10Z
dc.date.available: 2025-01-20T16:03:10Z
dc.date.issued: 2020
dc.description.abstract: Text mining and topic identification models are becoming increasingly relevant to extract value from the huge amount of unstructured textual information that companies obtain from their users and clients nowadays. Soft approaches to these problems are also gaining relevance, as in some contexts it may be unrealistic to assume that any document has to be associated to a single topic without any further consideration of the involved uncertainties. However, there is an almost total lack of reference documents allowing a proper assessment of the performance of soft classifiers in such soft topic identification tasks. To address this lack, in this paper a method is proposed that generates topic identification reference documents with a soft but objective nature, and which proceeds by combining, in random but known proportions, phrases of existing documents dealing with different topics. We also provide a computational study illustrating the application of the proposed method on a well-known benchmark for topic identification, as well as showing the possibility of carrying out an informative evaluation of soft classifiers in the context of soft topic identification.
dc.description.department: Depto. de Estadística e Investigación Operativa
dc.description.department: Depto. de Estadística y Ciencia de los Datos
dc.description.faculty: Fac. de Ciencias Matemáticas
dc.description.faculty: Fac. de Estudios Estadísticos
dc.description.faculty: Instituto de Matemática Interdisciplinar (IMI)
dc.description.refereed: TRUE
dc.description.sponsorship: Ministerio de Ciencia, Innovación y Universidades
dc.description.sponsorship: Universidad Complutense de Madrid
dc.description.status: pub
dc.identifier.citation: Vélez, D., Villarino, G., Rodríguez, J.T., Gómez, D. (2020). A Method to Generate Soft Reference Data for Topic Identification. In: Lesot, MJ., et al. Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2020. Communications in Computer and Information Science, vol 1239. Springer, Cham. https://doi.org/10.1007/978-3-030-50153-2_5
dc.identifier.doi: 10.1007/978-3-030-50153-2_5
dc.identifier.isbn: 9783030501525
dc.identifier.isbn: 9783030501532
dc.identifier.issn: 1865-0929
dc.identifier.issn: 1865-0937
dc.identifier.officialurl: https://doi.org/10.1007/978-3-030-50153-2_5
dc.identifier.uri: https://hdl.handle.net/20.500.14352/115196
dc.language.iso: eng
dc.page.final: 67
dc.page.initial: 54
dc.relation.projectID: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PGC2018-096509-B-I00/ES/GESTION INTELIGENTE DE INFORMACION BORROSA/
dc.relation.projectID: UCM Research Group 910149
dc.rights.accessRights: restricted access
dc.subject.keyword: Soft classification
dc.subject.keyword: Text mining
dc.subject.keyword: Topic identification
dc.subject.ucm: Estadística matemática (Matemáticas)
dc.subject.ucm: Investigación operativa (Matemáticas)
dc.subject.unesco: 1207 Investigación Operativa
dc.subject.unesco: 1209 Estadística
dc.title: A Method to Generate Soft Reference Data for Topic Identification
dc.type: conference paper
dspace.entity.type: Publication
relation.isAuthorOfPublication: 1375c631-ecbd-4b51-b213-c7d4148c3eba
relation.isAuthorOfPublication: ddad170a-793c-4bdc-b983-98d313c81b03
relation.isAuthorOfPublication: 4dcf8c54-8545-4232-8acf-c163330fd0fe
relation.isAuthorOfPublication.latestForDiscovery: 1375c631-ecbd-4b51-b213-c7d4148c3eba

Original bundle

Name: velez_softopicID.pdf
Size: 502.66 KB
Format: Adobe Portable Document Format