Gutiérrez García-Pardo, Inmaculada
Gómez González, Daniel
Castro Cantalejo, Javier
Bruce Bimber
Julien Labarre
2026-01-12
2026-01-12
2024
I. Gutiérrez, D. Gómez, J. Castro, B. Bimber and J. Labarre, "Beyond Large Language Models: Rediscovering the Role of Classical Statistics in Modern Data Science," 2024 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), Yokohama, Japan, 2024, pp. 1-8, doi: 10.1109/FUZZ-IEEE60900.2024.10611766.
1558-4739
10.1109/FUZZ-IEEE60900.2024.10611766
https://hdl.handle.net/20.500.14352/129910
This study explores the synergy between large language models and classical statistics in contemporary data science. In the field of large language models, we find that there is no one-size-fits-all model that satisfies the needs of other scientists. There are differences in the soft results, which may limit their application. To analyze these differences and the lack of robustness, we propose a robust methodology that integrates classical statistical experimental design principles with these advanced models, aiming to identify statistically significant differences among their outcomes. In particular, an experimental design is presented in which the main factors, levels, treatments and interactions that influence the predictions made by different complex natural language processing models are identified. The main aim of this research is to better understand the influence of some controlled factors that are used in complex natural language processing models by applying classical statistical techniques, providing a comprehensive perspective on the relative effectiveness of different zero-shot classification models. It aims to offer practitioners insights into when and where certain models may be more or less sensitive, facilitating informed decision-making in applying these advanced language models. Additionally, computational results obtained from a pilot dataset are presented. These results illustrate the entire process of the proposed methodology, highlighting the importance of considering statistical evidence when making decisions.
eng
Attribution-NonCommercial-NoDerivatives 4.0 International
http://creativecommons.org/licenses/by-nc-nd/4.0/
Beyond large language models: rediscovering the role of classical statistics in modern data science
journal article
https://ieeexplore.ieee.org/document/10611766
restricted access
51311004519.22-7
Sciences
Mathematics (Mathematics)
Applied statistics
12 Mathematics
1209 Statistics
1203.17 Computer science