Leveraging entities knowledge to bypass the Cold-Start recommender problem on Microsoft News Dataset

Online news has been a hot topic since 2002, when the New York Times published its news RSS feed (Doree, 2007). At that time, users subscribed to the feed with their Netscape Navigator browser and received a daily update of the headlines published in the newspaper. The feed was free for the user and generated no revenue for the newspaper. Nowadays, user engagement is one of the most profitable assets for information companies, so they pay closer attention to what they show their users. To keep users engaged, news aggregators offer them relevant information based on their interests. In the early days, companies gathered preferences by asking users to name the exact sources they wanted to read. Later, aggregators asked users about the kind and features of the information that interested them. Most current systems, however, do not ask users for explicit information but model their behavior from their navigation history. Recommendations are built on top of these user interest models by matching interests against news features. This matching was first done by exact classification match and has become progressively fuzzier, up to the present, where it is expressed as stickiness probabilities. The characterization of users and news has come a long way, from manual classification to the machine learning classification techniques used here, improving the recommendations users receive. This study dives into the Microsoft News Dataset (Wu et al., 2020) and analyzes the behavior of its users. The main objective was to predict whether a user will click on a news article, given the news that user clicked previously. This prediction helps any news aggregator portal show the news most relevant to the user's interests, saving the user time, reducing the resources consumed by the portal, and, most importantly, improving its engagement score.
Showing users the most relevant news is a cold-start problem: portals cannot gather enough collaborative information about news articles before they become obsolete. News is a volatile asset that depreciates dramatically as time passes. An article does not live long enough to collect the feedback from the user community needed to build a collaborative filtering (CF) score. Collaborative filtering was therefore excluded from the approach described above. Instead, feature engineering and feature inference were used to predict the importance of a news article for a user.
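As a rough illustration of how entity knowledge can sidestep the lack of click data on a fresh article, the sketch below scores candidate news by how much of a user's clicked-entity history they cover. It is a minimal example under stated assumptions, not the method used in this study: the function names, the entity identifiers, and the toy data are all hypothetical, and the affinity score is just one simple feature that could be engineered from entities.

```python
from collections import Counter


def user_entity_profile(clicked_news, news_entities):
    """Count how often each entity appears in the news a user clicked."""
    profile = Counter()
    for news_id in clicked_news:
        profile.update(news_entities.get(news_id, ()))
    return profile


def entity_affinity(profile, candidate_entities):
    """Fraction of the user's entity mass covered by the candidate (0.0 to 1.0)."""
    total = sum(profile.values())
    if total == 0:
        return 0.0  # user with no history: no entity signal available
    return sum(profile[e] for e in set(candidate_entities)) / total


# Hypothetical toy data: news id -> entity ids mentioned in the article.
news_entities = {
    "N1": ["Q_nba", "Q_lakers"],
    "N2": ["Q_nba", "Q_lebron"],
    "N3": ["Q_election"],
    "N4": ["Q_lakers", "Q_lebron"],  # fresh article with no clicks yet
    "N5": ["Q_weather"],
}

history = ["N1", "N2"]            # news the user clicked previously
candidates = ["N3", "N4", "N5"]   # news to rank for this user

profile = user_entity_profile(history, news_entities)
scores = {n: entity_affinity(profile, news_entities[n]) for n in candidates}
ranked = sorted(candidates, key=scores.get, reverse=True)
```

Because the score depends only on the article's entities and the user's own history, a newly published article such as the hypothetical `N4` can be ranked before anyone has clicked on it, which is the property collaborative filtering lacks in this setting.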
Doree, J. (2007). RSS: A Brief Introduction. The Journal of Manual & Manipulative Therapy, 15(1), 57–58. PubMed.
Ford, J. B. (2018). What Do We Know About Celebrity Endorsement in Advertising? Journal of Advertising Research, 58(1), 1–2.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
Gulla, J. A., Zhang, L., Liu, P., Özgöbek, Ö., & Su, X. (2017). The Adressa Dataset for News Recommendation. Proceedings of the International Conference on Web Intelligence, 1042–1048.
Hamilton, W. L. (2020). Graph Representation Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, 14(3), 1–159.
Hwang, K., & Zhang, Q. (2018). Influence of parasocial relationship between digital celebrities and their followers on followers’ purchase and electronic word-of-mouth intentions, and persuasion knowledge. Computers in Human Behavior, 87, 155–173.
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning (Vol. 103). Springer New York.
Okura, S., Tagami, Y., Ono, S., & Tajima, A. (2017). Embedding-Based News Recommendation for Millions of Users. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1933–1942.
O’Sullivan, D., & McCallig, J. (2012). Customer satisfaction, earnings and firm value. European Journal of Marketing, 46(6), 827–843.
Portela García-Miguel, J. (2019). Machine Learning. Introducción. UCM.
Wang, H., Zhang, F., Wang, J., Zhao, M., Li, W., Xie, X., & Guo, M. (2018). RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems. Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 417–426.
Wang, H., Zhang, F., Zhao, M., Li, W., Xie, X., & Guo, M. (2019). Multi-Task Feature Learning for Knowledge Graph Enhanced Recommendation. The World Wide Web Conference on - WWW ’19, 2000–2010.
Wang, X., He, X., Cao, Y., Liu, M., & Chua, T.-S. (2019). KGAT: Knowledge Graph Attention Network for Recommendation. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 950–958.
Wu, F., Qiao, Y., Chen, J.-H., Wu, C., Qi, T., Lian, J., Liu, D., Xie, X., Gao, J., Wu, W., & Zhou, M. (2020, July). MIND: A Large-scale Dataset for News Recommendation. ACL 2020.
Yu, S., & Kak, S. (n.d.). An Empirical Study of How Users Adopt Famous Entities. 7.