Predicting haplogroups using a versatile machine learning program (PredYMaLe) on a new mutationally balanced 32 Y-STR multiplex (CombYplex): unlocking the full potential of the human STR mutation rate spectrum to estimate forensic parameters

dc.contributor.author: Bouakaze, Caroline
dc.contributor.author: Delehelle, Franklin
dc.contributor.author: Saenz-Oyhéréguy, Nancy
dc.contributor.author: Moreira, Andreia
dc.contributor.author: Schiavinato, Stéphanie
dc.contributor.author: Croze, Myriam
dc.contributor.author: Delon, Solène
dc.contributor.author: Fortes-Lima, César
dc.contributor.author: Gibert, Morgane
dc.contributor.author: Bujan, Louis
dc.contributor.author: Huyghe, Eric
dc.contributor.author: Bellis, Gil
dc.contributor.author: Calderón Fernández, María Del Rosario
dc.contributor.author: Hernández, Candela
dc.contributor.author: Avendaño-Tamayo, Efren
dc.contributor.author: Bedoya, Gabriel
dc.contributor.author: Salas, Antonio
dc.contributor.author: Mazières, Stéphane
dc.contributor.author: Chiaroni, Jacques
dc.contributor.author: Migot-Nabias, Florence
dc.contributor.author: Ruiz-Linares, Andres
dc.contributor.author: Dugoujon, Jean-Michel
dc.contributor.author: Thèves, Catherine
dc.contributor.author: Mollereau-Manaute, Catherine
dc.contributor.author: Noûs, Camille
dc.contributor.author: Poulet, Nicolas
dc.contributor.author: King, Turi
dc.contributor.author: D'Amato, Maria Eugenia
dc.contributor.author: Balaresque, Patricia
dc.date.accessioned: 2024-02-02T08:35:02Z
dc.date.available: 2024-02-02T08:35:02Z
dc.date.issued: 2020
dc.description.abstract: We developed a new mutationally well-balanced 32 Y-STR multiplex (CombYplex) together with a machine learning (ML) program, PredYMaLe, to assess the impact of STR mutability on haplogroup prediction while respecting forensic community criteria (high DC/HD). We designed CombYplex around two sub-panels, M1 and M2, characterized by average- and high-mutation STRs. Using these two sub-panels, we tested how our program PredYMaLe reacts to mutability when considering basal branches and, moving down, terminal branches. We first tested the discrimination capacity of CombYplex on 996 human samples using various forensic and statistical parameters and showed that its resolution is sufficient to separate haplogroup classes. In parallel, PredYMaLe was designed and used to test whether an ML approach can predict haplogroup classes from Y-STR profiles. Applied to our kit, SVM and Random Forest classifiers perform very well (average 97%), better than Neural Network (average 91%) and Bayesian methods (<90%). We observe heterogeneity in haplogroup assignation accuracy among classes, with most haplogroups having high prediction scores (99-100%) and two (E1b1b and G) having lower scores (67%). The small sample sizes of these classes explain the high tendency to misclassify the Y-profiles of these haplogroups; results were measurably improved as soon as more training data were added. We provide evidence that our ML approach is a robust method to accurately predict haplogroups when it is combined with a sufficient number of markers, well-balanced mutation rate Y-STR panels, and large ML training sets. Further research on confounding factors (such as gene conversion) and on ideal STR panels for the branches analysed can be developed to help classifiers further optimize prediction scores.
dc.description.department: Depto. de Biodiversidad, Ecología y Evolución
dc.description.faculty: Fac. de Ciencias Biológicas
dc.description.refereed: TRUE
dc.description.sponsorship: University Toulouse III
dc.description.sponsorship: Ministerio de Economía y Competitividad (España)
dc.description.sponsorship: Observatory Man-Environment Haut-Vicdessos (France)
dc.description.sponsorship: National Research Foundation
dc.description.status: pub
dc.identifier.citation: Bouakaze C, Delehelle F, Saenz-Oyhéréguy N, Moreira A, Schiavinato S, Croze M, Delon S, Fortes-Lima C, Gibert M, Bujan L, Huyghe E, Bellis G, Calderon R, Hernández CL, Avendaño-Tamayo E, Bedoya G, Salas A, Mazières S, Chiaroni J, Migot-Nabias F, Ruiz-Linares A, Dugoujon JM, Thèves C, Mollereau-Manaute C, Noûs C, Poulet N, King T, D'Amato ME, Balaresque P. Predicting haplogroups using a versatile machine learning program (PredYMaLe) on a new mutationally balanced 32 Y-STR multiplex (CombYplex): Unlocking the full potential of the human STR mutation rate spectrum to estimate forensic parameters. Forensic Sci Int Genet. 2020 Sep;48:102342.
dc.identifier.doi: 10.1016/j.fsigen.2020.102342
dc.identifier.issn: 1872-4973
dc.identifier.officialurl: https://www.doi.org/10.1016/j.fsigen.2020.102342
dc.identifier.relatedurl: https://pubmed.ncbi.nlm.nih.gov/32818722/
dc.identifier.uri: https://hdl.handle.net/20.500.14352/98074
dc.journal.title: Forensic Science International: Genetics
dc.language.iso: eng
dc.page.initial: 102342
dc.publisher: Science Direct
dc.rights.accessRights: restricted access
dc.subject.keyword: Y-STR
dc.subject.keyword: Machine learning
dc.subject.keyword: Assignation accuracy and haplogroup prediction (Hg prediction)
dc.subject.keyword: Incremental mutation rates
dc.subject.ucm: Biology
dc.subject.unesco: 2402 Anthropology (Physical)
dc.title: Predicting haplogroups using a versatile machine learning program (PredYMaLe) on a new mutationally balanced 32 Y-STR multiplex (CombYplex): unlocking the full potential of the human STR mutation rate spectrum to estimate forensic parameters
dc.type: journal article
dc.type.hasVersion: VoR
dc.volume.number: 48
dspace.entity.type: Publication
relation.isAuthorOfPublication: 3dacca8b-b0ef-4c31-b987-3371b2ac9125
relation.isAuthorOfPublication.latestForDiscovery: 3dacca8b-b0ef-4c31-b987-3371b2ac9125

Download

Original bundle

Name: Predicting_haplogroups.pdf
Size: 6.6 MB
Format: Adobe Portable Document Format
