
Programming parallel dense matrix factorizations and inversion for new-generation NUMA architectures

dc.contributor.author: Herrero, José Ramón
dc.contributor.author: Quintana-Ortí, Enrique S.
dc.contributor.author: Catalán Pallarés, Sandra
dc.contributor.author: Igual Peña, Francisco Daniel
dc.contributor.author: Rodríguez Sánchez, Rafael
dc.date.accessioned: 2025-01-21T12:23:02Z
dc.date.available: 2025-01-21T12:23:02Z
dc.date.issued: 2023-01-19
dc.description.abstract: We propose a methodology to address the programmability issues derived from the emergence of new-generation shared-memory NUMA architectures. For this purpose, we employ dense matrix factorizations and matrix inversion (DMFI) as a use case, and we target two modern architectures (AMD Rome and Huawei Kunpeng 920) that exhibit configurable NUMA topologies. Our methodology pursues performance portability across different NUMA configurations by proposing multi-domain implementations for DMFI plus a hybrid task- and loop-level parallelization that configures multi-threaded executions to fix core-to-data binding, exploiting locality at the expense of minor code modifications. In addition, we introduce a generalization of the multi-domain implementations for DMFI that offers support for virtually any NUMA topology in present and future architectures. Our experimentation on the two target architectures for three representative dense linear algebra operations validates the proposal, reveals insights on the necessity of adapting both the codes and their execution to improve data access locality, and reports performance across architectures and inter- and intra-socket NUMA configurations competitive with state-of-the-art message-passing implementations, maintaining the ease of development usually associated with shared-memory programming.
dc.description.department: Sección Deptal. de Arquitectura de Computadores y Automática (Físicas)
dc.description.faculty: Fac. de Informática
dc.description.refereed: TRUE
dc.description.status: pub
dc.identifier.citation: Sandra Catalán, Francisco D. Igual, José R. Herrero, Rafael Rodríguez-Sánchez, Enrique S. Quintana-Ortí, Programming parallel dense matrix factorizations and inversion for new-generation NUMA architectures, Journal of Parallel and Distributed Computing, Volume 175, 2023, Pages 51-65, ISSN 0743-7315, https://doi.org/10.1016/j.jpdc.2023.01.004.
dc.identifier.doi: 10.1016/j.jpdc.2023.01.004
dc.identifier.officialurl: https://www.sciencedirect.com/science/article/pii/S0743731523000047?via%3Dihub
dc.identifier.uri: https://hdl.handle.net/20.500.14352/115348
dc.journal.title: Journal of Parallel and Distributed Computing
dc.language.iso: eng
dc.page.final: 65
dc.page.initial: 51
dc.publisher: Elsevier
dc.rights.accessRights: open access
dc.subject.keyword: NUMA architectures
dc.subject.keyword: Chiplets
dc.subject.keyword: Dense linear algebra
dc.subject.keyword: Shared-memory programming
dc.subject.keyword: Portability
dc.subject.ucm: Software
dc.subject.unesco: 33 Ciencias Tecnológicas
dc.title: Programming parallel dense matrix factorizations and inversion for new-generation NUMA architectures
dc.type: journal article
dc.type.hasVersion: AM
dc.volume.number: 175
dspace.entity.type: Publication
relation.isAuthorOfPublication: 9c042df5-5a71-4088-a155-194f339a226e
relation.isAuthorOfPublication: e1ed9960-37d5-4817-8e5c-4e0e392b4d66
relation.isAuthorOfPublication: 02e9ebb2-af1f-451a-a819-47cb4e4ce515
relation.isAuthorOfPublication.latestForDiscovery: 9c042df5-5a71-4088-a155-194f339a226e

Original bundle

Name: 2022_NUMA_DLA (26).pdf
Size: 397.08 KB
Format: Adobe Portable Document Format
