RT Generic
T1 Implementación de una API para el análisis de la voz en tiempo real enfocado a Entornos de Realidad Virtual
T2 Implementation of an API for real-time voice analysis focused on Virtual Reality Environments
A1 Martín Gómez, Daniel
AB Este proyecto consiste en el desarrollo de una API que permite enviar audio en tiempo real a la biblioteca HumeAI. Esta librería de Python permite el envío de frases, URLs, imágenes y audios (en los que se centra este trabajo) y realiza el análisis emocional de hasta 51 emociones diferentes de los elementos enviados. Posteriormente, se reciben los datos y se analizan con otra librería de Python para el software Praat, llamada Parselmouth. Esta permite detectar en tiempo real diversas características del audio a partir de patrones vocales como el tono de voz, el ritmo y la intensidad. La API permite que un usuario envíe los datos a procesar, en este caso la voz, a un servidor, que este se comunique con la API en cuestión y que se devuelva el análisis vocal al usuario para que gestione dichos valores vocales y emociones según lo requiera su aplicación. Además, se utiliza otra librería llamada Flask, cuyo propósito es crear un servidor que permita comunicar las peticiones HTTP entre la API y la aplicación desde la cual se llame, como podría ser un proyecto de Unity.
YR 2024
FD 2024
LK https://hdl.handle.net/20.500.14352/110732
UL https://hdl.handle.net/20.500.14352/110732
LA spa
NO This project involves the development of an API that sends real-time audio to the HumeAI library. This Python library accepts phrases, URLs, images, and audio (the focus of this work) and performs emotional analysis of up to 51 different emotions on the submitted elements. Subsequently, the data is received and analyzed with Parselmouth, another Python library for the Praat software. This allows the detection of various audio characteristics in real time using vocal patterns such as voice pitch, rhythm, and intensity. The API lets a user send the data to be processed, in this case voice, to a server, which communicates with the API and returns the vocal analysis to the user so they can manage the vocal values and emotions as required in their application. Additionally, another library, Flask, is used to create a server that carries HTTP requests between the API and the calling application, such as a Unity project.
DS Docta Complutense
RD 15 feb 2026