Machine Learning interspecific identification of mouse first lower molars (genus Mus Linnaeus, 1758) and application to fossil remains from the Estrecho Cave (Spain)

One of the first steps in palaeontological studies is the taxonomic identification of fossils according to their morphology. Geometric Morphometric techniques combined with multivariate statistical analysis are known to be precise tools for this purpose. More recent techniques such as Machine Learning are still rarely used in Palaeontology, although several studies have shown that they offer powerful alternative approaches to analysing quantitative morphometric data. Here we show that Machine Learning applied to two-dimensional geometric morphometric data from the outline shape of the lower first molars of Mus spp. helps to overcome taxonomic problems. We collated a photographic database of 303 lower first molars from modern populations of Mus musculus domesticus and Mus spretus from southwestern Europe and North Africa to compare the performance of classic multivariate statistics and Machine Learning algorithms in identifying the two species from their dental morphology. We also include Late Holocene Mus specimens from the Estrecho Cave (east-central Spain) to predict their specific status. Our results suggest that Machine Learning is more efficient than classical statistical analyses in the taxonomic identification of Mus molars, reaching 100% correct classification. Applying these techniques to fossil material showed that ensemble/stacking algorithms provided robust identification of both M. m. domesticus and M. spretus in the Estrecho Cave assemblage and confirmed that both species colonised the Iberian Peninsula before the formation of the site.
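The comparison described above, a classical discriminant method against a stacking ensemble scored by cross-validated classification accuracy, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract names neither the algorithms nor the software used, so scikit-learn, linear discriminant analysis as the classical baseline, the SVM and random forest base learners, and the logistic regression meta-learner are all assumptions. The data are synthetic stand-ins for outline shape descriptors; only the sample size of 303 comes from the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for outline shape descriptors of 303 molars from two
# species (assumption: real descriptors would come from the outline analysis).
X, y = make_classification(
    n_samples=303, n_features=20, n_informative=6, n_redundant=2,
    n_classes=2, random_state=0,
)

# The same stratified folds are used for both models so scores are comparable.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Classical multivariate baseline (assumed here to be LDA).
lda = LinearDiscriminantAnalysis()

# Stacking ensemble: base learners feed a logistic regression meta-learner.
# The choice of base learners is hypothetical.
stack = StackingClassifier(
    estimators=[
        ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

lda_scores = cross_val_score(lda, X, y, cv=cv)
stack_scores = cross_val_score(stack, X, y, cv=cv)

print(f"LDA mean accuracy:      {lda_scores.mean():.3f}")
print(f"Stacking mean accuracy: {stack_scores.mean():.3f}")
```

Under the same assumptions, a model fitted on the modern reference specimens could then assign unlabelled fossil descriptors to a species through its predict method. The accuracies printed for synthetic data say nothing about the results reported for the Mus molars.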