
Técnicas de Big Data y Machine-Learning para Recomendador Bibliográfico

dc.contributor.advisor: Gregorio Rodríguez, Carlos
dc.contributor.author: Gonzalo Fernández, Alejandro
dc.date.accessioned: 2023-06-17T10:18:30Z
dc.date.available: 2023-06-17T10:18:30Z
dc.date.issued: 2020-07
dc.description: Grade: 9.3
dc.description.abstract: This Master's thesis explores how to improve book recommender systems. The first chapter surveys the different kinds of recommender systems and the second covers natural language processing. The third and fourth chapters present the problem and our improvement hypothesis, together with its implementation. The main idea is to build a classifier that assigns books to genres such as science fiction, historical fiction, and crime. This classification underpins the outline of a book recommender system that makes recommendations based on users' thematic profiles. To cope with the large size of the data, Zipf's law is used as a key component.
dc.description.department: Sección Deptal. de Sistemas Informáticos y Computación
dc.description.faculty: Fac. de Ciencias Matemáticas
dc.description.refereed: TRUE
dc.description.status: submitted
dc.eprint.id: https://eprints.ucm.es/id/eprint/68580
dc.identifier.uri: https://hdl.handle.net/20.500.14352/9237
dc.language.iso: spa
dc.master.title: Tratamiento estadístico computacional de la información
dc.rights: Attribution-NonCommercial-ShareAlike 3.0 Spain
dc.rights.accessRights: open access
dc.rights.uri: https://creativecommons.org/licenses/by-nc-sa/3.0/es/
dc.subject.cdu: 004
dc.subject.cdu: 51
dc.subject.cdu: 519.22
dc.subject.keyword: Recommendation Systems
dc.subject.keyword: Big Data
dc.subject.keyword: Machine Learning
dc.subject.keyword: Natural Language Processing
dc.subject.keyword: Zipf's law
dc.subject.ucm: Computer Science
dc.subject.ucm: Mathematics
dc.subject.ucm: Statistics
dc.subject.unesco: 1203.17 Computer Science
dc.subject.unesco: 12 Mathematics
dc.subject.unesco: 1209 Statistics
dc.title: Técnicas de Big Data y Machine-Learning para Recomendador Bibliográfico
dc.title.alternative: Big Data and Machine Learning Techniques for Recommendation Systems
dc.type: master thesis
Download
Name: TFM.pdf
Size: 1.18 MB
Format: Adobe Portable Document Format