Detection of Sexual Content in Audio and Text Using Transformers and Federated Learning on Android Devices
Publication date
2025
Abstract
Technological progress has had a major impact on society in recent decades. Internet and mobile device use is increasingly common and begins at an ever earlier age. This work uses machine learning to detect possible cases of grooming, a form of online harassment whose main victims are minors. It focuses on the analysis of text and audio messages with sexual content: several models are trained on existing datasets and studied in detail to determine whether prediction can be improved. The most suitable model is then integrated into an Android application to analyze how the models behave on a mobile device, where computing capacity is generally more limited. After integration, the application is extended with features such as sending an alert when text or audio analysis flags a possible case of sexual content. In addition, federated learning is adopted so that the model can evolve with additional data.
Description
Bachelor's thesis (Trabajo de Fin de Grado) in Computer Engineering (Ingeniería Informática), Facultad de Informática UCM, Departamento de Ingeniería de Software e Inteligencia Artificial, academic year 2024/2025.