RT Book, Section
T1 Gem5-X: a gem5-based system level simulation framework to optimize many-core platforms
A1 Mahmood Qureshi, Yasir
A1 Simon, William Andrew
A1 Zapater, Marina
A1 Atienza, David
A1 Olcoz Herrero, Katzalin
AB The rapid expansion of online-based services requires novel energy and performance efficient architectures to meet power and latency constraints. Fast architectural exploration has become a key enabler in the proposal of architectural innovation. In this paper, we present gem5-X, a gem5-based system level simulation framework, and a methodology to optimize many-core systems for performance and power. As real-life case studies of many-core server workloads, we use real-time video transcoding and image classification using convolutional neural networks (CNNs). Gem5-X allows us to identify bottlenecks and evaluate the potential benefits of architectural extensions such as in-cache computing and 3D stacked High Bandwidth Memory (HBM). For real-time video transcoding, we achieve 15% speed-up using in-order cores with in-cache computing when compared to a baseline in-order system and 76% energy savings when compared to an Out-of-Order system. When using HBM, we further accelerate real-time transcoding and CNNs by up to 7% and 8% respectively.
PB IEEE
SN 978-1-5108-8388-8
YR 2019
FD 2019
LK https://hdl.handle.net/20.500.14352/14027
UL https://hdl.handle.net/20.500.14352/14027
LA eng
NO ©2019 IEEE. Spring Simulation Conference (SpringSim) (2019. Tucson, Arizona). This work has been partially supported by the EC H2020 RECIPE (GA No. 801137) project, the ERC Consolidator Grant COMPUSAPIEN (GA No. 725657), the EU FEDER and the Spanish MINECO (GA No. TIN2015-65277-R).
NO European Union. H2020
NO Ministerio de Economía y Competitividad (MINECO)/FEDER
DS Docta Complutense
RD 6 Apr 2025