Twitter user multiclass classification during US 2020 electoral campaign

dc.contributor.advisor: Gómez González, Daniel
dc.contributor.advisor: Robles Morales, José Manuel
dc.contributor.advisor: Caballero Roldán, Rafael
dc.contributor.author: Mrzic, Erol
dc.date.accessioned: 2023-06-16T14:49:33Z
dc.date.available: 2023-06-16T14:49:33Z
dc.date.issued: 2021-09
dc.description.abstract: Owing to the unprecedented growth of social media content over the last decade, data-based analysis has become the norm in the modern world. The adoption of Machine Learning algorithms and Data Science methods has changed virtually every industry. One of the most active research areas in Machine Learning today is Natural Language Processing (NLP), a field of Artificial Intelligence (AI) that allows computers to read, understand, and deduce meaning from human languages. In this paper we applied Natural Language Processing methods and algorithms to two Twitter datasets collected during the US 2020 elections in order to group both users and tweets into multiple categories based on their support for a candidate. The purpose of this work was to establish whether these individuals and their individual tweets can be correctly classified based on their aggregated opinions, and to create a predictive classification model focused on text analysis. As a result, we constructed, trained and tested multiple models that can help predict the probability of a user’s sentiment toward the candidates based on their tweets. We showed that in 63% of cases a user’s sentiment can be classified with high probability from the aggregate of their tweets.
dc.description.department: Depto. de Estadística y Ciencia de los Datos
dc.description.faculty: Fac. de Estudios Estadísticos
dc.description.refereed: FALSE
dc.description.status: submitted
dc.eprint.id: https://eprints.ucm.es/id/eprint/68413
dc.identifier.uri: https://hdl.handle.net/20.500.14352/5167
dc.language.iso: eng
dc.master.title: Máster en Minería de Datos e Inteligencia de Negocios
dc.rights.accessRights: open access
dc.subject.cdu: 004.85
dc.subject.keyword: Data Science
dc.subject.keyword: Machine Learning
dc.subject.keyword: Sentiment analysis
dc.subject.keyword: Multiclass prediction
dc.subject.keyword: Natural Language Processing
dc.subject.keyword: Machine learning (Artificial intelligence)
dc.subject.keyword: Natural language processing
dc.subject.ucm: Computer Science (Computer Science)
dc.subject.ucm: Statistics
dc.subject.ucm: Social Research Techniques
dc.subject.unesco: 1203.17 Computer Science
dc.subject.unesco: 1209 Statistics
dc.subject.unesco: 6302.03 Social Research Design
dc.title: Twitter user multiclass classification during US 2020 electoral campaign
dc.type: master thesis
dspace.entity.type: Publication
relation.isAdvisorOfPublication: 4dcf8c54-8545-4232-8acf-c163330fd0fe
relation.isAdvisorOfPublication: e2662924-fa9e-477e-9261-d6fbd339d717
relation.isAdvisorOfPublication: d17b0355-2695-449e-b06e-a34f4e27f120
relation.isAdvisorOfPublication.latestForDiscovery: 4dcf8c54-8545-4232-8acf-c163330fd0fe

Download

Original bundle

Name: Erol-Mrzic-tfm.pdf
Size: 1.47 MB
Format: Adobe Portable Document Format