RT Journal Article T1 Reuse detector: improving the management of STT-RAM SLLCs A1 Rodríguez Rodríguez, Roberto Alonso A1 Díaz, Javier A1 Castro Rodríguez, Fernando A1 Ibáñez, Pablo A1 Chaver Martínez, Daniel Ángel A1 Viñals, Víctor A1 Sáez Alcaide, Juan Carlos A1 Prieto Matías, Manuel A1 Piñuel Moreno, Luis A1 Monreal, Teresa A1 Llabería, José María AB Various constraints of Static Random Access Memory (SRAM) are leading researchers to consider new memory technologies as candidates for building on-chip shared last-level caches (SLLCs). Spin-Transfer Torque RAM (STT-RAM) is currently postulated as the prime contender due to its better energy efficiency, smaller die footprint and higher scalability. However, STT-RAM also exhibits some drawbacks, such as slow and energy-hungry write operations, that need to be mitigated before it can be used in SLLCs for the next generation of computers. In this work, we address these shortcomings by leveraging a new management mechanism for STT-RAM SLLCs. This approach is based on the previous observation that although the stream of references arriving at the SLLC of a Chip MultiProcessor (CMP) exhibits limited temporal locality, it does exhibit reuse locality, i.e. blocks referenced several times have a high probability of forthcoming reuse. As such, conventional STT-RAM SLLC management mechanisms, mainly focused on exploiting temporal locality, behave inefficiently. In this paper, we employ a cache management mechanism that selects the contents of the SLLC with the aim of exploiting reuse locality instead of temporal locality. Specifically, our proposal consists of the inclusion of a Reuse Detector (RD) between the private cache levels and the STT-RAM SLLC. Its mission is to detect blocks that do not exhibit reuse, in order to avoid their insertion in the SLLC, hence reducing the number of write operations and the energy consumption in the STT-RAM. 
Our evaluation, using multiprogrammed workloads in quad-core, eight-core and 16-core systems, reveals that our scheme achieves, on average, energy reductions in the SLLC in the range of 37–30%, additional energy savings in the main memory in the range of 6–8% and performance improvements of 3% (quad-core), 7% (eight-core) and 14% (16-core) compared with an STT-RAM SLLC baseline where no RD is employed. More importantly, our approach outperforms DASCA, the state-of-the-art STT-RAM SLLC management scheme. Depending on the specific scenario and the kind of applications used, it reports SLLC energy savings in the range of 4–11% higher than those of DASCA, higher performance in the range of 1.5–14% and additional improvements in DRAM energy consumption in the range of 2–9% higher than DASCA. PB Oxford University Press SN 0010-4620 YR 2018 FD 2018-06 LK https://hdl.handle.net/20.500.14352/94365 UL https://hdl.handle.net/20.500.14352/94365 LA eng NO R Rodríguez-Rodríguez, J Díaz, F Castro, P Ibáñez, D Chaver, V Viñals, J C Saez, M Prieto-Matias, L Piñuel, T Monreal, J M Llabería, Reuse Detector: Improving the Management of STT-RAM SLLCs, The Computer Journal, Volume 61, Issue 6, June 2018, Pages 856–880, https://doi.org/10.1093/comjnl/bxx099 NO The postprint version of the article has been deposited NO Gobierno de España NO HIPEAC-4 European Network of Excellence NO Costa Rican Ministry of Science and Technology (MICIT) NO National Council for Scientific and Technological Research (CONICIT) DS Docta Complutense RD 9 Apr 2025