Accuracy of LLMs to retrieve numeric data for meta-analysis in dentistry

dc.contributor.author: Caponio, Vito Carlo Alberto
dc.contributor.author: Lorenzo-Pouso, Alejandro I
dc.contributor.author: Magalhaes, Marco
dc.contributor.author: Ali, Aiman
dc.contributor.author: Adamo, Daniela
dc.contributor.author: Cirillo, Nicola
dc.contributor.author: López-Pintor Muñoz, Rosa María
dc.contributor.author: Musella, Gennaro
dc.date.accessioned: 2026-01-12T11:57:04Z
dc.date.available: 2026-01-12T11:57:04Z
dc.date.issued: 2025-11-19
dc.description.abstract: Objectives: Evidence-based dentistry relies heavily on systematic reviews and meta-analyses (SRMA), which are considered the most robust forms of evidence. However, conducting an SRMA is time- and resource-intensive, and data extraction is prone to high error rates. Artificial intelligence (AI) and large language models (LLMs) offer the potential to automate and accelerate SRMA processes such as data extraction, but the reliability and accuracy of these AI-based technologies must first be assessed. This study evaluated the accuracy of four LLMs (DeepSeek v3 R1, Claude 3.5 Sonnet, ChatGPT-4o, and Gemini 2.0-flash) in extracting primary numeric outcome data across various dental topics. Methods: LLMs were queried via APIs using default settings and a SMART-format prompt. Descriptive analysis was conducted at the sub-outcome, outcome, and study levels. Errors were classified as hallucinations, missed data, or omitted data. Results: Overall extraction accuracy was exceptionally high at the sub-outcome level, with only 3 hallucinations (all from Gemini 2.0-flash). Total errors increased at the outcome and study levels. Gemini 2.0-flash generally performed significantly worse than the other models (p < 0.01). Claude 3.5 Sonnet and DeepSeek v3 R1 generally exhibited superior accuracy and lower omission rates in full-text extraction compared to Gemini 2.0-flash and ChatGPT-4o. Conclusions: This first comparative evaluation of multiple LLMs for data extraction from full-text PDFs in dental research highlights their significant potential but also their limitations. Performance varied notably between models, and cost did not directly correlate with superior performance. While single data point extraction was highly accurate, errors increased at higher aggregation levels. Standardized outcome reporting in studies could benefit future LLM extraction, and we offer a solid benchmark for future performance comparisons.
Clinical significance: This study demonstrates that LLMs can achieve high accuracy in extracting single numeric outcomes, but omission errors in full-text analyses limit their independent use in SRMA. Improving outcome reporting standards and leveraging accurate, lower-cost models may enhance evidence synthesis efficiency in dentistry and beyond.
dc.description.department: Depto. de Especialidades Clínicas Odontológicas
dc.description.faculty: Fac. de Odontología
dc.description.refereed: TRUE
dc.description.status: pub
dc.identifier.citation: Caponio VCA, Lorenzo-Pouso AI, Magalhaes M, Ali A, Adamo D, Cirillo N, López-Pintor RM, Musella G. Accuracy of LLMs to retrieve numeric data for meta-analysis in dentistry. J Dent. 2026 Jan;164:106245. doi: 10.1016/j.jdent.2025.106245
dc.identifier.doi: 10.1016/j.jdent.2025.106245
dc.identifier.essn: 1879-176X
dc.identifier.issn: 0300-5712
dc.identifier.officialurl: https://doi.org/10.1016/j.jdent.2025.106245
dc.identifier.pmid: 41265689
dc.identifier.relatedurl: https://www.sciencedirect.com/science/article/pii/S0300571225006906?via%3Dihub
dc.identifier.relatedurl: https://pubmed.ncbi.nlm.nih.gov/41265689/
dc.identifier.uri: https://hdl.handle.net/20.500.14352/129877
dc.journal.title: Journal of Dentistry
dc.language.iso: eng
dc.page.initial: 106245
dc.publisher: Elsevier
dc.rights: Attribution 4.0 International
dc.rights.accessRights: open access
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject.cdu: 616.314:004.83
dc.subject.keyword: Artificial intelligence
dc.subject.keyword: Data extraction
dc.subject.keyword: Dentistry
dc.subject.keyword: Large language model
dc.subject.keyword: Meta-analysis
dc.subject.keyword: Systematic review
dc.subject.ucm: Dentistry (Dentistry)
dc.subject.ucm: Artificial intelligence (Computer science)
dc.subject.ucm: Semiotics
dc.subject.unesco: 32 Medical Sciences
dc.subject.unesco: 3201 Clinical Sciences
dc.subject.unesco: 1203.17 Informatics
dc.subject.unesco: 6114.18 Symbolic Communication
dc.title: Accuracy of LLMs to retrieve numeric data for meta-analysis in dentistry
dc.type: journal article
dc.type.hasVersion: VoR
dc.volume.number: 164
dspace.entity.type: Publication
relation.isAuthorOfPublication: b686e7da-b3c7-41a9-bbe0-8c1f30cbc553
relation.isAuthorOfPublication.latestForDiscovery: b686e7da-b3c7-41a9-bbe0-8c1f30cbc553
