XGBoost models based on non imaging features for the prediction of mild cognitive impairment in older adults

Citation

Fernández-Blázquez, M. A., Ruiz-Sánchez de León, J. M., Sanz-Blasco, R., Verche, E., Ávila-Villanueva, M., Gil-Moreno, M. J., Montenegro-Peña, M., Terrón, C., Fernández-García, C., & Gómez-Ramírez, J. (2025). XGBoost models based on non imaging features for the prediction of mild cognitive impairment in older adults. Scientific Reports, 15(1). https://doi.org/10.1038/S41598-025-14832-0

Abstract

The global increase in dementia cases highlights the importance of early detection and intervention, particularly for individuals at risk of mild cognitive impairment (MCI), a precursor to dementia. The aim of this study is to develop and validate machine learning (ML) models based on non-imaging features to predict the risk of MCI conversion in cognitively healthy older adults over a three-year period. Using data from 845 participants aged 65 to 87 years, we built five eXtreme Gradient Boosting (XGBoost) models of increasing complexity, incorporating demographic, self-reported, medical, and cognitive variables. The models were trained and evaluated using robust preprocessing techniques, including multiple imputation for missing data, Synthetic Minority Oversampling Technique (SMOTE) for class balancing, and SHapley Additive exPlanations (SHAP) for interpretability. Model performance improved with the inclusion of cognitive assessments, with the most comprehensive model (Model 5) achieving the highest accuracy (86%) and area under the curve (AUC = 0.8359). Feature importance analysis revealed that variables such as memory tests, depressive symptoms, and age were significant predictors of MCI conversion. In addition, an online risk calculator has been developed and made available free of charge to facilitate clinical use and provide a practical, cost-effective tool for early detection in diverse healthcare settings (https://aimar-project.shinyapps.io/MCI-risk-calculator/). This study highlights the potential of non-imaging ML models for early detection of MCI and emphasizes their accessibility and clinical utility. Future research should focus on validating these models in different populations and examining their integration with personalized intervention strategies to reduce dementia risk.
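The class-balancing step named in the abstract (SMOTE) can be illustrated with a short sketch. This is not the authors' implementation, which the abstract does not describe; it is a minimal, standard-library Python version of the general idea, in which each synthetic minority sample is placed at a random point on the segment between a real minority sample and one of its nearest minority neighbours. The function name `smote_oversample`, its parameters, and the toy data are all hypothetical.

```python
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points built SMOTE-style.

    Hypothetical helper for illustration only: each new point is a random
    interpolation between a minority sample and one of its k nearest
    minority neighbours (squared Euclidean distance).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the remaining minority samples by distance to the base sample.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Toy minority class with two made-up, already-scaled features.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_oversample(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point is a convex combination of two real samples, it always stays inside the range spanned by the observed minority data. In practice, oversampling of this kind is usually applied to the training split only, so that evaluation data contain no synthetic cases.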

Description

This work was supported by multiple funding sources. The collection of the database used in this study was made possible through funding from the Spanish Ministry of Science, Innovation and Universities (project reference: RTI2018-098762-B-C31) and from the Fundación General de la Universidad de Salamanca (FGUSAL) through the Centro Internacional sobre el Envejecimiento (CENIE) under Grant 0348_CIE_6_E, financed by EU FEDER funds. The data analysis, including the application of machine learning techniques, was carried out within the framework of the project funded by the Spanish Ministry of Science and Innovation through the 2022 Knowledge Generation Projects call (project reference: PID2022-141966OB-I00).
