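Generating pictograms with artificial intelligence for augmentative and alternative communication (original title: Generación de pictogramas con inteligencia artificial para la comunicación aumentativa y alternativa)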
Publication date
2025
Abstract
Generating pictograms with artificial intelligence (AI) is intended primarily to support educators, doctors, psychologists, and other professionals who work with people with communication difficulties. Creating pictograms on demand, in several styles and from simple descriptions, removes the dependency on pre-existing databases and improves both the flow and the communication possibilities these resources offer. This Bachelor's Thesis (Trabajo de Fin de Grado, TFG) focuses on developing LoRAs (Low-Rank Adaptations) with free software, with the aim of extending generative image AI to the field of Augmentative and Alternative Communication (AAC). The project used the Kohya_ss tool, which allows highly customizable model training. The base models are built on Stable Diffusion, specifically version 1.5, chosen for its balance between quality, generation speed, and low hardware requirements. After numerous trials and adjustments, LoRAs were trained that generate pictograms in two distinct styles quickly, consistently, and without high-end hardware. The work also includes a first exploration of developing these LoRA models on more powerful technologies such as Stable Diffusion XL. These results are an important step toward more inclusive and efficient communication, improving the quality of life of people with communication difficulties and easing the work of their caregivers and the professionals who support them.
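The abstract names LoRA (Low-Rank Adaptation) without describing the mechanism. As a rough illustration only, the sketch below shows the generally published LoRA formulation, in which a frozen weight matrix W is adapted as W + (alpha / r) · B · A, with B and A being small trainable factors of rank r. It is not the thesis's training code and does not use Kohya_ss; the function names, the toy 3x3 matrix, and the 768x768 layer size are assumptions chosen for readability.

```python
# Illustrative sketch of a low-rank adaptation (LoRA) weight update.
# All names, shapes, and values are hypothetical, not taken from the thesis.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def lora_weight(w, b, a, alpha, rank):
    """Return W + (alpha / rank) * (B @ A); W itself is left unchanged."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def trainable_params(d_out, d_in, rank):
    """Count parameters in the factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)

# Frozen 3x3 base weight (identity, to keep the arithmetic easy to follow).
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
# Rank-1 factors: B is 3x1 and A is 1x3.
B = [[1.0], [2.0], [0.0]]
A = [[0.5, 0.0, 1.0]]

W_adapted = lora_weight(W, B, A, alpha=1.0, rank=1)

# Assumed 768x768 layer with rank 8: only the two small factors are trained.
full_params = 768 * 768
lora_params = trainable_params(768, 768, 8)
```

Under these assumed sizes, the adapter holds 12,288 trainable values against 589,824 in the full matrix, which is one reason LoRA training is often feasible on modest hardware, in line with the low hardware requirements the abstract emphasizes.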
Description
Bachelor's Thesis (Trabajo de Fin de Grado) in Computer Engineering, Facultad de Informática UCM, Departamento de Ingeniería del Software e Inteligencia Artificial, academic year 2024/2025.













