Generación de resúmenes de video-entrevistas utilizando redes neuronales

Lozano Hernández, Francisco Javier; Alcázar Muñoz, Daniel

dc.contributor.advisor	Díaz Esteban, Alberto
dc.contributor.author	Lozano Hernández, Francisco Javier
dc.contributor.author	Alcázar Muñoz, Daniel
dc.date.accessioned	2023-06-16T14:56:53Z
dc.date.available	2023-06-16T14:56:53Z
dc.date.issued	2021-09-21
dc.degree.title	Grado en Ingeniería del Software
dc.description	Bachelor's thesis (Trabajo de Fin de Grado) in Software Engineering, Facultad de Informática, Departamento de Ingeniería del Software e Inteligencia Artificial, academic year 2020-2021. The source code for this project is hosted at: https://github.com/NILGroup/TFG2021RecuerdosVideo
dc.description.abstract	El Alzheimer es una enfermedad neurodegenerativa que produce deterioro cognitivo, pérdida de memoria, así como problemas con el pensamiento y el comportamiento. Actualmente, existen terapias que tienen como objetivo mejorar la calidad de vida de las personas que padecen esta enfermedad. Una de estas terapias es la historia de vida, una técnica narrativa en la que el paciente cuenta su vida al terapeuta en diferentes sesiones en un formato de entrevista semi-estructurada. Aprovechando las nuevas tecnologías es posible grabar en vídeo las diferentes sesiones de las historias de vida para después procesarlas y así servir de ayuda a los terapeutas que tratan a estas personas. Un paso más allá de esta idea, se encuentra el proyecto CANTOR. Este proyecto propone el desarrollo de una herramienta digital integrada capaz de utilizar tecnologías de inteligencia artificial, realizando un apoyo a la terapia ocupacional. De esta forma, este trabajo se centra en la investigación y la creación de una aplicación web capaz de obtener la transcripción de las video-entrevistas y generar el resumen en español utilizando redes neuronales. Para la transcripción, además de la biblioteca SpeechRecognition para separar el audio por silencios, se utiliza la API de SpeechToText de Google para identificar y separar los hablantes. Para la generación de resúmenes usamos un modelo BETO pre-entrenado en resúmenes de noticias cuya implementación se hace utilizando la biblioteca de Hugging Face Transformers. Se expone la experimentación y conclusiones tanto de las diferentes maneras en las que se transcribe un vídeo como de las diferentes formas de pre-procesar la transcripción para generar el resumen.
dc.description.abstract	Alzheimer’s is a neurodegenerative disease that causes cognitive impairment, memory loss, and problems with thinking and behaviour. Currently, there are therapies that aim to improve the quality of life of people suffering from this disease. One of these therapies is the life story, a narrative technique in which the patient tells the therapist about his or her life over several sessions in a semi-structured interview format. With new technologies, it is possible to video-record the life story sessions and then process them to help the therapists who treat these people. One step beyond this idea is the CANTOR project, which proposes the development of an integrated digital tool that uses artificial intelligence technologies to support occupational therapy. Accordingly, this work focuses on the research and creation of a web application that obtains the transcription of the video-interviews and generates a summary in Spanish using neural networks. For the transcription, in addition to the SpeechRecognition library, used to split the audio at silences, the Google SpeechToText API is used to identify and separate the speakers. The summaries are generated with a BETO model pre-trained on news summaries, implemented using the Hugging Face Transformers library. Experimentation and conclusions are presented both on the different ways in which a video is transcribed and on the different ways of pre-processing the transcript to generate the summary.
dc.description.department	Depto. de Ingeniería de Software e Inteligencia Artificial (ISIA)
dc.description.faculty	Fac. de Informática
dc.description.refereed	TRUE
dc.description.status	unpub
dc.eprint.id	https://eprints.ucm.es/id/eprint/68333
dc.identifier.relatedurl	https://github.com/NILGroup/TFG2021RecuerdosVideo
dc.identifier.uri	https://hdl.handle.net/20.500.14352/5354
dc.language.iso	spa
dc.page.total	104
dc.rights	Attribution-NonCommercial 3.0 Spain
dc.rights.accessRights	open access
dc.rights.uri	https://creativecommons.org/licenses/by-nc/3.0/es/
dc.subject.cdu	004(043.3)
dc.subject.keyword	PLN
dc.subject.keyword	Vídeo
dc.subject.keyword	Transcripción
dc.subject.keyword	Generación de resúmenes
dc.subject.keyword	Redes neuronales
dc.subject.keyword	Historia de vida
dc.subject.keyword	Transformer
dc.subject.keyword	BERT
dc.subject.keyword	BETO
dc.subject.keyword	SpeechToText
dc.subject.keyword	NLP
dc.subject.keyword	Video
dc.subject.keyword	Transcription
dc.subject.keyword	Summary generation
dc.subject.keyword	Neural networks
dc.subject.keyword	Life story
dc.subject.ucm	Informática (Informática)
dc.subject.unesco	1203.17 Informática
dc.title	Generación de resúmenes de video-entrevistas utilizando redes neuronales
dc.title.alternative	Generation of video-interview summaries using neural networks
dc.type	bachelor thesis
dspace.entity.type	Publication
relation.isAdvisorOfPublication	97e9fa87-0f3e-48d8-9832-0abd05ecd9c0
relation.isAdvisorOfPublication.latestForDiscovery	97e9fa87-0f3e-48d8-9832-0abd05ecd9c0

Original bundle

Name: ALCÁZAR MUÑOZ 84971_DANIEL_ALCAZAR_MUNOZ_TFG_Generacion_de_resumenes_de_video-entrevistas_utilizando_redes_neuronales.pdf_1006096_1757555295.pdf
Size: 2.59 MB
Format: Adobe Portable Document Format

Collections

Trabajos Fin de Grado (TFG) y Diplomas de Estudios Avanzados (DEA)