Customization of the text-to-image diffusion model by fine-tuning for the generation of synthetic images of cyanobacterial blooms in lentic water bodies

dc.contributor.authorBarrientos-Espillco, Fredy
dc.contributor.authorPajares Martínsanz, Gonzalo
dc.contributor.authorLópez Orozco, José Antonio
dc.contributor.authorBesada Portas, Eva
dc.date.accessioned2025-07-10T13:48:57Z
dc.date.available2025-07-10T13:48:57Z
dc.date.issued2025-08
dc.descriptionThis work has been supported by the Research Projects IA-GESBLOOM-CM (Y2020/TCS-6420) of the Synergic program of the Comunidad Autónoma de Madrid, SMART-BLOOMS (TED2021-130123B-I00) funded by the Spanish Ministry of Science and Innovation and the European Union NextGeneration, and INSERTION (PID2021-127648OB-C33) of the Knowledge Generation Programs of the Spanish Ministry of Science and Innovation. The first author, Fredy Barrientos-Espillco, is supported by a scholarship from PRONABEC, Ministry of Education of Peru.
dc.description.abstractCyanobacterial blooms emerge unpredictably on the surface of lentic water bodies, posing both ecological threats and public health risks. To effectively monitor these events, this study introduces the use of Machine Vision Systems (MVS) integrated into Autonomous Surface Vehicles (ASVs). These ASVs are capable of autonomous and safe navigation, enabling them to detect cyanobacterial blooms while avoiding obstacles. Convolutional Neural Networks (CNNs) are employed for early detection and continuous monitoring, but their effectiveness hinges on access to large, high-quality training datasets. Due to the sporadic and uncontrollable nature of bloom occurrences, acquiring sufficient real-world images for training and validating CNN models is a significant challenge. To overcome this, the Stable Diffusion XL (SDXL) text-to-image generative model is utilized to produce realistic synthetic images, ensuring a sufficient dataset for training. However, SDXL alone struggles to accurately depict cyanobacterial blooms. To address this limitation, DreamBooth is used to fine-tune SDXL with a small set of real bloom-specific image patches. To ensure the diversity of the synthetic dataset, detailed prompts for SDXL are generated using a Large Language Model (LLM). The combination of SDXL fine-tuning with LLM-driven prompt design, applied to environmental monitoring and autonomous navigation in lentic environments, represents the core innovation of this work. A dual-task CNN model is then trained on the synthetic dataset to simultaneously detect blooms and obstacles. Experimental results demonstrate the effectiveness and novelty of the proposed approach, showing improvements of up to 15.74% in object detection and 6.48% in semantic segmentation compared to the baseline dataset.
dc.description.departmentDepto. de Arquitectura de Computadores y Automática
dc.description.facultyFac. de Informática
dc.description.refereedTRUE
dc.description.sponsorshipComunidad de Madrid
dc.description.sponsorshipMinisterio de Ciencia e Innovación (España)
dc.description.sponsorshipMinisterio de Educación (Perú)
dc.description.statuspub
dc.identifier.citationBarrientos-Espillco, F., Pajares, G., López-Orozco, J. A., & Besada-Portas, E. (2025). Customization of the text-to-image diffusion model by fine-tuning for the generation of synthetic images of cyanobacterial blooms in lentic water bodies. Expert Systems with Applications, 287, 128169.
dc.identifier.doi10.1016/j.eswa.2025.128169
dc.identifier.issn0957-4174
dc.identifier.officialurlhttps://doi.org/10.1016/j.eswa.2025.128169
dc.identifier.urihttps://hdl.handle.net/20.500.14352/122408
dc.issue.number128169
dc.journal.titleExpert Systems with Applications
dc.language.isoeng
dc.publisherElsevier
dc.relation.projectIDinfo:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2021-127648OB-C33/ES/COOPERACION DE VEHICULOS DE SUPERFICIE Y AEREOS PARA APLICACIONES DE INSPECCION EN ENTORNOS CAMBIANTES/
dc.rightsAttribution 4.0 International
dc.rights.accessRightsopen access
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/
dc.subject.keywordArtificial intelligence
dc.subject.keywordMachine vision systems
dc.subject.keywordText-to-image generation
dc.subject.keywordFine-tuning
dc.subject.keywordLarge language model
dc.subject.keywordConvolutional neural networks
dc.subject.keywordAutonomous surface vehicles
dc.subject.keywordCyanobacterial blooms
dc.subject.ucmInteligencia artificial (Informática)
dc.subject.ucmSoftware
dc.subject.ucmLenguajes de programación
dc.subject.ucmSistemas expertos
dc.subject.unesco1203.04 Inteligencia Artificial
dc.subject.unesco1203.17 Informática
dc.titleCustomization of the text-to-image diffusion model by fine-tuning for the generation of synthetic images of cyanobacterial blooms in lentic water bodies
dc.typejournal article
dc.type.hasVersionVoR
dc.volume.number287
dspace.entity.typePublication
relation.isAuthorOfPublication878e090e-a59f-4f17-b5a2-7746bed14484
relation.isAuthorOfPublication26b95994-f79c-4d7c-8de5-a003d6d2a770
relation.isAuthorOfPublication0acc96fe-6132-45c5-ad71-299c9dcb6682
relation.isAuthorOfPublication.latestForDiscovery878e090e-a59f-4f17-b5a2-7746bed14484

Download

Original bundle

Name:
Customization_text_to_image_diffusion_model.pdf
Size:
12.64 MB
Format:
Adobe Portable Document Format
