RT Journal Article
T1 Customization of the text-to-image diffusion model by fine-tuning for the generation of synthetic images of cyanobacterial blooms in lentic water bodies
A1 Barrientos-Espillco, Fredy
A1 Pajares Martínsanz, Gonzalo
A1 López Orozco, José Antonio
A1 Besada Portas, Eva
AB Cyanobacterial blooms emerge unpredictably on the surface of lentic water bodies, posing both ecological threats and public health risks. To effectively monitor these events, this study introduces the use of Machine Vision Systems (MVS) integrated into Autonomous Surface Vehicles (ASVs). These ASVs are capable of autonomous and safe navigation, enabling them to detect cyanobacterial blooms while avoiding obstacles. Convolutional Neural Networks (CNNs) are employed for early detection and continuous monitoring, but their effectiveness hinges on access to large, high-quality training datasets. Due to the sporadic and uncontrollable nature of bloom occurrences, acquiring sufficient real-world images for training and validating CNN models is a significant challenge. To overcome this, the Stable Diffusion XL (SDXL) text-to-image generative model is utilized to produce realistic synthetic images, ensuring a sufficient dataset for training. However, SDXL alone struggles to accurately depict cyanobacterial blooms. To address this limitation, DreamBooth is used to fine-tune SDXL with a small set of real bloom-specific image patches. To ensure the diversity of the synthetic dataset, detailed prompts for SDXL are generated using a Large Language Model (LLM). The combination of SDXL fine-tuning with LLM-driven prompt design, applied to environmental monitoring and autonomous navigation in lentic environments, represents the core innovation of this work. A dual-task CNN model is then trained on the synthetic dataset to simultaneously detect blooms and obstacles. Experimental results demonstrate the effectiveness and novelty of the proposed approach, showing improvements of up to 15.74% in object detection and 6.48% in semantic segmentation compared to the baseline dataset.
PB Elsevier
SN 0957-4174
YR 2025
FD 2025-08
LK https://hdl.handle.net/20.500.14352/122408
UL https://hdl.handle.net/20.500.14352/122408
LA eng
NO Barrientos-Espillco, F., Pajares, G., López-Orozco, J. A., & Besada-Portas, E. (2025). Customization of the text-to-image diffusion model by fine-tuning for the generation of synthetic images of cyanobacterial blooms in lentic water bodies. Expert Systems with Applications, 287, 128169.
NO This work has been supported by the Research Projects IA-GES-BLOOM-CM (Y2020/TCS-6420) of the Synergic program of the Comunidad Autónoma de Madrid, SMART-BLOOMS (TED2021-130123B-I00) funded by the Spanish Ministry of Science and Innovation and the European Union NextGeneration, and INSERTION (PID2021-127648OBC33) of the Knowledge Generation Programs of the Spanish Ministry of Science and Innovation. The first author, Fredy Barrientos-Espillco, is supported by a scholarship by PRONABEC, Ministry of Education of Peru.
NO Comunidad de Madrid
NO Ministerio de Ciencia e Innovación (España)
NO Ministerio de Educación (Perú)
DS Docta Complutense
RD 23 Apr 2026