Exploiting Elasticity via OS-Runtime Cooperation to Improve CPU Utilization in Multicore Systems
Publication date
2024
Citation
J. Rubio, C. Bilbao, J. C. Saez and M. Prieto-Matias, "Exploiting Elasticity via OS-Runtime Cooperation to Improve CPU Utilization in Multicore Systems," 2024 32nd Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), Dublin, Ireland, 2024, pp. 35-43, doi: 10.1109/PDP62718.2024.00014
Abstract
The chip multicore processor (CMP) architecture has become the predominant design choice for contemporary general-purpose systems across multiple sectors of commercial technology. Thanks to technological progress, CMP systems can now feature hundreds of cores. While multithreaded applications may benefit from the increasing core counts, leveraging all available cores is not always feasible due to limited Thread-Level Parallelism (TLP), load imbalance among threads, and other scalability bottlenecks. Colocating multiple applications on the same node is becoming a popular practice to maximize processor utilization. In HPC, malleability (the ability to dynamically alter the number of active threads within the same application) is also being exploited at the runtime-system level to better deal with scenarios exhibiting time-varying scalability. In the cloud, application colocation is leveraged along with different forms of coarse-grained elasticity to cater to varying resource demands. This work introduces an operating system (OS) level elastic mechanism designed to efficiently leverage idle CPU periods in workloads consisting of unmodified applications, many of which do not rely on a runtime system to function. This mechanism constitutes a form of fine-grained vertical elasticity that leverages cooperation between the runtime system and the OS to maximize CPU utilization. To this end, it opportunistically increases the active thread count of malleable applications during idle periods. We implemented our proposed OS extensions in the Linux kernel, and augmented the GNU OpenMP runtime to show a proof of concept of the required OS-runtime interaction. Using diverse multithreaded programs, we demonstrate the ability of the proposed OS support to substantially improve system throughput.