Técnicas de Big Data y Machine-Learning para Recomendador Bibliográfico

Gonzalo Fernández, Alejandro

dc.contributor.advisor	Gregorio Rodríguez, Carlos
dc.contributor.author	Gonzalo Fernández, Alejandro
dc.date.accessioned	2023-06-17T10:18:30Z
dc.date.available	2023-06-17T10:18:30Z
dc.date.issued	2020-07
dc.description	Grade: 9.3
dc.description.abstract	En este Trabajo Fin de Máster se presenta la idea de mejorar los recomendadores bibliográficos. Por ello presentamos los distintos sistemas de recomendación en un primer capítulo, el procesamiento de lenguaje natural en un segundo y en el tercero y cuarto capítulo presentamos el problema y nuestra hipótesis de mejora junto con su implementación. La principal idea es crear un clasificador en diferentes temáticas: ciencia ficción, histórico, policíaco, etc. Esta clasificación servirá para realizar un esquema de un sistema de recomendación bibliográfico que proporciona recomendaciones basadas en los perfiles temáticos de los usuarios. Para solventar el problema del gran tamaño de estos datos usaremos la Ley de Zipf como pieza fundamental.
dc.description.abstract	This Master's thesis explores how to improve book recommender systems. The first chapter surveys the different types of recommender systems and the second covers natural language processing. The third and fourth chapters present the problem and our improvement hypothesis, together with its implementation. The main idea is to build a classifier for different genres: science fiction, historical fiction, crime, etc. This classification is then used to outline a book recommender system that makes recommendations based on users' thematic profiles. Zipf's Law is the key tool used to cope with the large size of the data.
dc.description.department	Sección Deptal. de Sistemas Informáticos y Computación
dc.description.faculty	Fac. de Ciencias Matemáticas
dc.description.refereed	TRUE
dc.description.status	submitted
dc.eprint.id	https://eprints.ucm.es/id/eprint/68580
dc.identifier.uri	https://hdl.handle.net/20.500.14352/9237
dc.language.iso	spa
dc.master.title	Tratamiento estadístico computacional de la información
dc.rights	Attribution-NonCommercial-ShareAlike 3.0 Spain
dc.rights.accessRights	open access
dc.rights.uri	https://creativecommons.org/licenses/by-nc-sa/3.0/es/
dc.subject.cdu	004
dc.subject.cdu	51
dc.subject.cdu	519.22
dc.subject.keyword	Sistemas de Recomendación
dc.subject.keyword	Big Data
dc.subject.keyword	Machine Learning
dc.subject.keyword	Procesamiento del Lenguaje Natural
dc.subject.keyword	Ley de Zipf
dc.subject.keyword	Recommendation Systems
dc.subject.keyword	Natural Language Processing
dc.subject.keyword	Zipf law
dc.subject.ucm	Informática (Informática)
dc.subject.ucm	Matemáticas (Matemáticas)
dc.subject.ucm	Estadística
dc.subject.unesco	1203.17 Informática
dc.subject.unesco	12 Matemáticas
dc.subject.unesco	1209 Estadística
dc.title	Técnicas de Big Data y Machine-Learning para Recomendador Bibliográfico
dc.title.alternative	Big Data and Machine Learning Techniques for Recommendation Systems
dc.type	master thesis
dspace.entity.type	Publication
relation.isAdvisorOfPublication	05a01c46-aac8-42b2-a6bc-4b95860cf5bf
relation.isAdvisorOfPublication.latestForDiscovery	05a01c46-aac8-42b2-a6bc-4b95860cf5bf

Name: TFM.pdf
Size: 1.18 MB
Format: Adobe Portable Document Format

Collections

Trabajos Fin de Master (TFM)