Pineda San Juan, Silvia
Perpiñán Pérez, Rocío
2025-11-10
2025-11-10
2025
https://hdl.handle.net/20.500.14352/125919
This work focuses on the development and evaluation of predictive models to estimate the risk of myocardial infarction in the adult Spanish population, using microdata from the 2017 Spanish National Health Survey. The main objective is to identify individuals at higher cardiovascular risk through clinical, demographic, and lifestyle variables, in order to contribute to early detection and more effective medical interventions. Given the significant class imbalance in the dataset (low prevalence of infarction cases), resampling techniques were applied using the SMOTE algorithm in 50-50 and 60-40 configurations, following stratification by sex. Predictive models were then built using LASSO regression and Random Forest (RF), with performance assessed using metrics such as sensitivity, specificity, and the area under the ROC curve (AUC). The resulting models showed good predictive performance, and a mixed factorial analysis helped uncover relevant patterns among variables. The findings of this study may support clinical decision-making and promote more effective prevention strategies for cardiovascular diseases.
spa
Aplicación de técnicas de machine learning y oversampling para la predicción del riesgo cardiovascular en adultos en España [Application of machine learning and oversampling techniques for predicting cardiovascular risk in adults in Spain]
bachelor thesis
open access
519.2; 519.22-7; 614; 616.12
Cardiovascular risk; Myocardial infarction; Machine learning; SMOTE; Random Forest; LASSO regression; Public health
Statistics; Applied statistics; Cardiology; Public health (Medicine)
1209 Statistics; 1209.03 Data Analysis; 3205.01 Cardiology; 3212 Public Health