On the capacity of artificial intelligence techniques and statistical methods to deal with low-quality data in medical supply chain environments
Publication date
2024
Publisher
Elsevier
Citation
Santos Arteaga, F. J., Di Caprio, D., Tavana, M., Cucchiari, D., Campistol, J. M., Oppenheimer, F., Diekmann, F., & Revuelta, I. (2024). On the capacity of artificial intelligence techniques and statistical methods to deal with low-quality data in medical supply chain environments. Engineering Applications of Artificial Intelligence, 133. https://doi.org/10.1016/J.ENGAPPAI.2024.108610
Abstract
We illustrate the capacity of Artificial Intelligence (AI) and Machine Learning (ML) techniques to categorize consistently even when the quality of the data decreases through mistakes or mismatches across matrix entries, whereas standard statistical methods exhibit significant changes in the values of the corresponding coefficients. We design algorithms of different complexity to generate a series of comparable profiles. These profiles are compared within environments that allow for an immediate identification of the generating algorithms and within increasingly complex settings involving almost identical profiles derived from different algorithms. AI and ML techniques outperform standard statistical methods when distinguishing the algorithms generating the profiles. Building on these results, we perform a retrospective analysis where AI and ML techniques are applied to two empirical scenarios defined by different data series of patients transplanted during the period 2006–2019. In the first scenario, the variables describing the evolution of patients are entered correctly. In the second, we modify the content of the vectors of characteristics defining the evolution of patients by exchanging the values of a subset of realizations from two categorical variables. AI and ML techniques categorize patients accurately and consistently within both scenarios, a feature particularly relevant when the quality of the information sources composing the medical chain varies. This problem is exacerbated among hospitals located in developing countries, where the quality of the data gathered limits their identification and extrapolation capacities.
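The data modification described for the second scenario, exchanging the values of two categorical variables for a subset of realizations, can be pictured with a short sketch. This is an illustration only, not the authors' procedure: the function name `swap_categorical_values`, the variable names `donor_type` and `rejection`, the toy records, and the random choice of the affected subset are all assumptions introduced here.

```python
import random


def swap_categorical_values(records, var_a, var_b, fraction, seed=0):
    """Return a corrupted copy of `records` plus the indices that were altered.

    For a randomly chosen subset of rows (hypothetical selection rule), the
    values stored under `var_a` and `var_b` are exchanged. The input list
    and its dictionaries are left untouched.
    """
    rng = random.Random(seed)
    n_swap = round(fraction * len(records))
    swapped_idx = set(rng.sample(range(len(records)), n_swap))

    corrupted = []
    for i, row in enumerate(records):
        new_row = dict(row)  # shallow copy so the original row is preserved
        if i in swapped_idx:
            new_row[var_a], new_row[var_b] = row[var_b], row[var_a]
        corrupted.append(new_row)
    return corrupted, sorted(swapped_idx)


# Toy patient records with two made-up categorical variables.
patients = [
    {"id": 1, "donor_type": "living", "rejection": "no"},
    {"id": 2, "donor_type": "deceased", "rejection": "yes"},
    {"id": 3, "donor_type": "living", "rejection": "yes"},
    {"id": 4, "donor_type": "deceased", "rejection": "no"},
]

corrupted, swapped = swap_categorical_values(
    patients, "donor_type", "rejection", fraction=0.5, seed=42
)

for i in swapped:
    print(patients[i], "->", corrupted[i])
```

A classifier trained or evaluated on `corrupted` rather than `patients` would then face the kind of mismatched entries the abstract refers to; comparing its accuracy across the two versions is one way to probe robustness to low-quality data.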