RT Journal Article
T1 Source attribution of human Campylobacter infection: a multi-country model in the European Union
A1 Thystrup, Cecilie
A1 Brinch, Maja Lykke
A1 Henri, Clementine
A1 Mughini-Gras, Lapo
A1 Franz, Eelco
A1 Wieczorek, Kinga
A1 Gutierrez, Montserrat
A1 Prendergast, Deirdre M.
A1 Duffy, Geraldine
A1 Burgess, Catherine M.
A1 Bolton, Declan
A1 Álvarez Sánchez, Julio
A1 López Chavarrías, Vicente
A1 Rosendal, Thomas
A1 Clemente, Lurdes
A1 Amaro, Ana
A1 Zomer, Aldert L.
A1 Grimstrup Joensen, Katrine
A1 Møller Nielsen, Eva
A1 Scavia, Gaia
A1 Skarżyńska, Magdalena
A1 Pinto, Miguel
A1 Oleastro, Mónica
A1 Cha, Wonhee
A1 Thépault, Amandine
A1 Rivoal, Katell
A1 Denis, Martine
A1 Chemaly, Marianne
A1 Hald, Tine
AB Introduction: Infections caused by Campylobacter spp. represent a severe threat to public health worldwide. National action plans have included source attribution studies as a way to quantify the contribution of specific sources and understand the dynamic of transmission of foodborne pathogens like Salmonella and Campylobacter. Such information is crucial for implementing targeted intervention. The aim of this study was to predict the sources of human campylobacteriosis cases across multiple countries using available whole-genome sequencing (WGS) data and explore the impact of data availability and sample size distribution in a multi-country source attribution model. Methods: We constructed a machine-learning model using k-mer frequency patterns as input data to predict human campylobacteriosis cases per source. We then constructed a multi-country model based on data from all countries. Results using different sampling strategies were compared to assess the impact of unbalanced datasets on the prediction of the cases. Results: The results showed that the variety of sources sampled and the quantity of samples from each source impacted the performance of the model. Most cases were attributed to broilers or cattle for the individual and multi-country models.
The proportion of cases that could be attributed with 70% probability to a source decreased when using the down-sampled data set (535 vs. 273 of 2627 cases). The baseline model showed a higher sensitivity compared to the down-sampled model, where samples per source were more evenly distributed. The proportion of cases attributed to non-domestic sources was higher but varied depending on the sampling strategy. Both models showed that most cases could be attributed to domestic sources in each country (baseline: 248/273 cases, 91%; down-sampled: 361/535 cases, 67%). Discussion: The sample sizes per source and the variety of sources included in the model influence the accuracy of the model and consequently the uncertainty of the predicted estimates. The attribution estimates for sources with a high number of samples available tend to be overestimated, whereas the estimates for sources with only a few samples tend to be underestimated. Recommendations for future sampling strategies include aiming for a more balanced sample distribution to improve the overall accuracy and utility of source attribution efforts.
PB Frontiers
SN 1664-302X
YR 2025
FD 2025-02-05
LK https://hdl.handle.net/20.500.14352/117887
UL https://hdl.handle.net/20.500.14352/117887
LA eng
NO Thystrup C, Brinch ML, Henri C, Mughini-Gras L, Franz E, Wieczorek K, Gutierrez M, Prendergast DM, Duffy G, Burgess CM, Bolton D, Alvarez J, Lopez-Chavarrias V, Rosendal T, Clemente L, Amaro A, Zomer AL, Grimstrup Joensen K, Nielsen EM, Scavia G, Skarżyńska M, Pinto M, Oleastro M, Cha W, Thépault A, Rivoal K, Denis M, Chemaly M and Hald T (2025) Source attribution of human Campylobacter infection: a multi-country model in the European Union. Front. Microbiol. 16:1519189. doi: 10.3389/fmicb.2025.1519189
NO Author contributions. CT: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing.
ML: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing. CH: Data curation, Methodology, Software, Validation, Writing – original draft, Writing – review & editing. LM-G: Funding acquisition, Writing – original draft, Writing – review & editing. EF: Writing – original draft, Writing – review & editing. KW: Writing – original draft, Writing – review & editing. MG: Writing – original draft, Writing – review & editing. DP: Writing – original draft, Writing – review & editing. GD: Writing – original draft, Writing – review & editing. CB: Writing – original draft, Writing – review & editing. DB: Writing – original draft, Writing – review & editing. JA: Writing – original draft, Writing – review & editing. VL-C: Writing – original draft, Writing – review & editing. TR: Writing – original draft, Writing – review & editing. LC: Writing – original draft, Writing – review & editing. AA: Writing – original draft, Writing – review & editing. AZ: Writing – original draft, Writing – review & editing. KG: Writing – original draft, Writing – review & editing. EN: Writing – original draft, Writing – review & editing. GS: Writing – original draft, Writing – review & editing. MS: Writing – original draft, Writing – review & editing. MP: Writing – original draft, Writing – review & editing. MO: Writing – original draft, Writing – review & editing. WC: Writing – original draft, Writing – review & editing. AT: Writing – original draft, Writing – review & editing. KR: Writing – original draft, Writing – review & editing. MD: Writing – original draft, Writing – review & editing. MC: Writing – original draft, Writing – review & editing. TH: Conceptualization, Data curation, Funding acquisition, Investigation, Project administration, Supervision, Validation, Writing – original draft, Writing – review & editing.
NO Ministerio de Ciencia, Innovación y Universidades (España)
NO European Commission
DS Docta Complutense
RD 7 Apr 2025