Reconocimiento y análisis de personas en imágenes mediante técnicas de aprendizaje profundo
Publication date
2025
Abstract
This work has two main objectives. The first is a study of object recognition methods for images. The study covers the mechanisms underlying these methods, namely Convolutional Neural Networks (CNNs) and the mathematical operations that complement them, such as pooling and activation functions.
This study provides the knowledge needed to train specific detection models. The detection task chosen for this project is locating people in aerial images taken by drones in natural environments. Models that work reliably in this setting can support activities such as locating missing persons, preventing accidents, and monitoring human activity in natural environments.
The models studied are CNN-based and belong to the class of single-stage detectors, specifically the YOLO (You Only Look Once) family. An exploratory phase examines how the different variants behave on the images selected as the dataset. This phase yields the data and results needed to conclude which configuration best solves the problem addressed.
The second objective is the development of an application that lets external users train models and check their robustness on a set of test images. The application exposes a series of training configurations, so a user can generate model variants and then, using the functionality for testing models on images, determine which one best meets their objectives.
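The abstract names pooling and activation functions as the operations that complement CNNs. As a rough illustration only, the sketch below applies a ReLU activation followed by 2x2 max pooling to a small feature map in plain Python. The function names (`relu`, `max_pool`), the choice of ReLU and max pooling, and the sample values are assumptions made for this example; the record does not say which activation or pooling variants the thesis uses.

```python
from typing import List

Matrix = List[List[float]]


def relu(feature_map: Matrix) -> Matrix:
    """Apply ReLU element-wise: negative values become 0.0."""
    return [[max(0.0, value) for value in row] for row in feature_map]


def max_pool(feature_map: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max pooling with a size x size window.

    Rows or columns that do not fill a complete window are dropped.
    """
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    pooled = []
    for i in range(rows):
        pooled_row = []
        for j in range(cols):
            # Collect the values inside the current window.
            window = [
                feature_map[i * size + di][j * size + dj]
                for di in range(size)
                for dj in range(size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 feature map, e.g. the output of one convolution filter.
feature_map = [
    [1.0, -2.0, 3.0, 0.5],
    [-1.0, 4.0, -0.5, 2.0],
    [0.0, -3.0, -1.0, -2.0],
    [2.0, 1.0, -4.0, -0.1],
]

activated = relu(feature_map)   # negatives clipped to 0.0
pooled = max_pool(activated)    # 4x4 -> 2x2
print(pooled)
```

Pooling halves each spatial dimension here, which is how a CNN reduces resolution while keeping the strongest response in each region.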
Description
Bachelor's thesis (Trabajo de Fin de Grado) in Computer Engineering and Software Engineering, Facultad de Informática, UCM, Department of Software Engineering and Artificial Intelligence (ISIA), academic year 2024/2025.













