RT Journal Article
T1 Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM
A1 Alaejos, Guillermo
A1 Castelló, Adrián
A1 Alonso-Jordá, Pedro
A1 Martínez, Héctor
A1 Quintana-Ortí, Enrique S.
A1 Igual Peña, Francisco Daniel
AB We explore the use of the Apache TVM open source framework to automatically generate a family of algorithms that follow the approach taken by popular linear algebra libraries, such as GotoBLAS2, BLIS, and OpenBLAS, to obtain high-performance blocked formulations of the general matrix multiplication (gemm). In addition, we fully automate the generation process by also leveraging the Apache TVM framework to derive a complete variety of processor-specific micro-kernels for gemm. This is in contrast with the convention in high-performance libraries, which hand-encode a single micro-kernel per architecture using assembly code. Overall, the combination of our TVM-generated blocked algorithms and micro-kernels for gemm (1) improves portability and maintainability and, more broadly, streamlines the software life cycle; (2) provides high flexibility to easily tailor and optimize the solution to different data types, processor architectures, and matrix operand shapes, yielding performance on a par (or even superior for specific matrix shapes) with that of hand-tuned libraries; and (3) features a small memory footprint.
PB Association for Computing Machinery
YR 2024
FD 2024-03-16
LK https://hdl.handle.net/20.500.14352/115351
UL https://hdl.handle.net/20.500.14352/115351
LA eng
NO Guillermo Alaejos, Adrián Castelló, Pedro Alonso-Jordá, Francisco D. Igual, Héctor Martínez, and Enrique S. Quintana-Ortí. 2024. Algorithm 1039: Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM. ACM Trans. Math. Softw. 50, 1, Article 6 (March 2024), 34 pages. https://doi.org/10.1145/3638532
DS Docta Complutense
RD 5 Apr 2025