
Automatic extraction of ranked SNP-phenotype associations from text using a BERT-LSTM-based method

dc.contributor.author: Bokharaeian, Behrouz
dc.contributor.author: Dehghani, Mohammad
dc.contributor.author: Díaz Esteban, Alberto
dc.date.accessioned: 2024-01-10T18:45:32Z
dc.date.available: 2024-01-10T18:45:32Z
dc.date.issued: 2023-04-12
dc.description.abstract: Extracting associations between single nucleotide polymorphisms (SNPs) and phenotypes from biomedical literature is a vital task in BioNLP. Several methods have recently been developed to extract mutation-disease associations. However, no available method for extracting SNP-phenotype associations from text considers their degree of certainty. In this paper, several machine learning methods were developed to extract ranked SNP-phenotype associations from biomedical abstracts and were then compared with each other. The methods developed were shallow machine learning methods (random forest, logistic regression, and decision tree), two kernel-based methods (subtree and local context), a rule-based method, a deep CNN-LSTM-based method, and two BERT-based methods. The experiments indicated that, although the linguistic features used could support an association extraction method that outperforms its kernel-based counterparts, the deep learning and BERT-based methods performed best; among all the methods developed, PubMedBERT-LSTM achieved the best results. Similar experiments were conducted to estimate the degree of certainty of each extracted association, which can be used to assess the strength of the reported association. These experiments showed that the proposed PubMedBERT-CNN-LSTM method outperformed the sophisticated methods on this task.
dc.description.department: Depto. de Ingeniería de Software e Inteligencia Artificial (ISIA)
dc.description.faculty: Fac. de Informática
dc.description.refereed: TRUE
dc.description.status: pub
dc.identifier.doi: 10.1186/s12859-023-05236-w
dc.identifier.issn: 1471-2105
dc.identifier.officialurl: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-023-05236-w
dc.identifier.uri: https://hdl.handle.net/20.500.14352/92387
dc.issue.number: 144
dc.journal.title: BMC Bioinformatics
dc.language.iso: eng
dc.publisher: Springer Nature
dc.rights: Attribution 4.0 International
dc.rights.accessRights: open access
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject.keyword: SNP
dc.subject.keyword: Phenotype
dc.subject.keyword: Biomedical relation extraction
dc.subject.keyword: Degree of certainty classification
dc.subject.ucm: Artificial intelligence (Computer science)
dc.subject.unesco: 1203.04 Artificial Intelligence
dc.title: Automatic extraction of ranked SNP-phenotype associations from text using a BERT-LSTM-based method
dc.type: journal article
dc.type.hasVersion: VoR
dc.volume.number: 24
dspace.entity.type: Publication
relation.isAuthorOfPublication: 97e9fa87-0f3e-48d8-9832-0abd05ecd9c0
relation.isAuthorOfPublication.latestForDiscovery: 97e9fa87-0f3e-48d8-9832-0abd05ecd9c0

Original bundle

Name: s12859-023-05236-w.pdf
Size: 2.46 MB
Format: Adobe Portable Document Format