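Aplicación de técnicas de machine learning y oversampling para la predicción del riesgo cardiovascular en adultos en España (Application of machine learning and oversampling techniques to predict cardiovascular risk in adults in Spain)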

Publication date

2025

Defense date

09/2025

Abstract

This work develops and evaluates predictive models to estimate the risk of myocardial infarction in the adult Spanish population, using microdata from the 2017 Spanish National Health Survey. The main objective is to identify individuals at higher cardiovascular risk from clinical, demographic, and lifestyle variables, in order to support early detection and more effective medical interventions. Because the dataset is strongly imbalanced (infarction cases have low prevalence), the data were stratified by sex and then resampled with the SMOTE algorithm in 50-50 and 60-40 configurations. Predictive models were then built using LASSO regression and Random Forest (RF), and their performance was assessed using metrics such as sensitivity, specificity, and the area under the ROC curve (AUC). The resulting models showed good predictive performance, and a mixed factorial analysis helped uncover relevant patterns among variables. The findings may support clinical decision-making and promote more effective prevention strategies for cardiovascular diseases.
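To illustrate the resampling step described in the abstract, the following is a minimal sketch of SMOTE-style oversampling and of the sensitivity and specificity metrics. It is not the code used in the thesis. The function names (`n_synthetic`, `smote`, `sensitivity_specificity`) and the toy data are hypothetical, and the sketch makes three assumptions: all features are numeric (survey data with categorical variables would need a variant such as SMOTE-NC), "60-40" means 60% majority and 40% minority, and class 1 denotes an infarction case.

```python
import math
import random


def n_synthetic(n_majority, n_minority, minority_share):
    """Synthetic minority rows needed so that the minority class makes up
    `minority_share` of the resampled data (0.5 -> 50-50, 0.4 -> 60-40).
    Assumption: the majority class is left untouched."""
    target_minority = round(minority_share * n_majority / (1.0 - minority_share))
    return max(0, target_minority - n_minority)


def smote(minority, n_new, k=5, seed=0):
    """SMOTE-style oversampling for numeric features: each synthetic row lies
    on the segment between a minority row and one of its k nearest minority
    neighbours. Requires at least two minority rows."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity (recall on class 1) and specificity (recall on class 0).
    Assumes both classes appear in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy usage: 900 majority rows and 100 minority rows (made-up counts).
print(n_synthetic(900, 100, 0.5))  # rows to add for a 50-50 split
print(n_synthetic(900, 100, 0.4))  # rows to add for a 60-40 split

toy_minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [2.5, 2.5]]
print(smote(toy_minority, n_new=3, k=2, seed=1))
```

Under the same assumptions, stratifying by sex would mean running `smote` separately on the minority rows of each sex. Oversampling should be applied to the training data only, so that sensitivity, specificity, and AUC are measured on data with the original class prevalence.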
