Analysis of the Transformer Architecture and application on a Large Language Model for mental health counseling

Herencia López-Menchero, Andrés

Download

andres_herencia_TFM_TECI.pdf (6.81 MB)

Publication date

2024

Defense date

06/2024

Authors

Herencia López-Menchero, Andrés

Advisors (or tutors)

Vega Barbas, Mario

Villanueva Díez, Ignacio

URI

https://hdl.handle.net/20.500.14352/106894

Abstract

Rapid advances in generative Artificial Intelligence (AI) have marked a milestone in the field of Natural Language Processing (NLP). In particular, Transformer models have redefined the state of the art thanks to their effectiveness and efficiency across a wide range of tasks, both general and specific. This work explores the Transformer architecture and its application to Large Language Models (LLMs) through Parameter-Efficient Fine-Tuning (PEFT) techniques, with the aim of creating a conversational model for mental health counseling. The study provides a detailed explanation of the architecture, including its theoretical and mathematical foundations. The fine-tuning process uses a recent state-of-the-art technique known as Low-Rank Adapters (LoRA). The fine-tuned model is then evaluated against the original model to verify the effectiveness of the adaptation. The document closes with conclusions that highlight the main advantages and disadvantages of the applied methodology, as well as possible future lines of research.

UCM subjects

Artificial intelligence (Computer science)

Unesco subjects

1203.04 Artificial Intelligence

Collections

Trabajos Fin de Master (TFM)
