Author: Gómez Silva, María José
Repository record dates: 2025-01-23; 2025-01-23
Publication year: 2021
Citation: Gómez-Silva, M.J. Deep multi-shot network for modelling appearance similarity in multi-person tracking applications. Multimed Tools Appl 80, 23701–23721 (2021). https://doi.org/10.1007/s11042-020-10256-2
ISSN: 1380-7501
DOI: 10.1007/s11042-020-10256-2
Handle: https://hdl.handle.net/20.500.14352/115731
Note: "This version of the article has been accepted for publication, after peer review but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/10.1007/S11042-020-10256-2"
Abstract: The automation of Multi-Object Tracking becomes a demanding task in real unconstrained scenarios, where the algorithms have to deal with crowds, crossing people, occlusions, disappearances and the presence of visually similar individuals. In those circumstances, the data association between the incoming detections and their corresponding identities can miss some tracks or produce identity switches. To reduce these tracking errors, and their propagation to subsequent frames, this article presents a Deep Multi-Shot neural model for measuring the Degree of Appearance Similarity (MS-DoAS) between person observations. The model gives temporal consistency to the representation of each individual's appearance and provides an affinity metric for frame-by-frame data association, allowing online tracking. The model has been deliberately trained to cope with previous identity switches and missed observations in the tracks it handles. For that purpose, a novel data generation tool has been designed to create training tracklets that simulate such situations. The model has demonstrated a high capacity to discern whether a new observation corresponds to a certain track or not, achieving a classification accuracy of 97% in a hard test that simulates tracks with previous mistakes.
Moreover, the tracking efficiency of the model in a surveillance application has been demonstrated by integrating it into the frame-by-frame association of a Tracking-by-Detection algorithm.
Language: eng
License: Attribution-NonCommercial-ShareAlike 4.0 International (http://creativecommons.org/licenses/by-nc-sa/4.0/)
Title: Deep multi-shot network for modelling appearance similarity in multi-person tracking applications
Type: journal article
E-ISSN: 1573-7721
Related URLs:
https://doi.org/10.1007/s11042-020-10256-2
https://arxiv.org/pdf/2004.03531
https://link.springer.com/article/10.1007/s11042-020-10256-2
Access rights: open access
UDC: 519.7
Keywords: Deep neural network; Appearance similarity; Multi-shot recognition; Multi-object tracking
Subject: Artificial intelligence (Computer science)
UNESCO classification: 1203.04 Artificial Intelligence