TY - GEN
T1 - Re-NUCA
T2 - 30th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2016
AU - Kotra, Jagadish B.
AU - Arjomand, Mohammad
AU - Guttman, Diana
AU - Kandemir, Mahmut T.
AU - Das, Chita R.
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/7/18
Y1 - 2016/7/18
AB - Although resistive RAM (ReRAM) technology offers a good combination of high capacity and low power for cache memories, its long write latency and low endurance are potential showstoppers to its wide commercial adoption. In particular, its low write endurance can cause fast wear-out of cache lines, bringing reliability issues and leading to capacity reduction over time. This problem is exacerbated when a ReRAM cache has a dynamic NUCA structure, where each core brings most of its data to the cache banks close to itself and writes become localized. We propose Re-NUCA, a NUCA architecture design for ReRAM caches that addresses the lifetime problem while keeping performance high. Re-NUCA relies on performance-wise data criticality: if it determines that a cache line is performance critical, it keeps the line in the banks close to the target core, as in dynamic NUCA; otherwise, it maps the cache line onto banks using static NUCA to distribute writes evenly over cache banks. This change in the mapping of cache lines to banks significantly relaxes the lifetime problem in ReRAM NUCA and wear-levels the banks. Re-NUCA needs logic for detecting performance-critical cache lines and low-overhead changes to the TLB for keeping mapping information. Our experimental results for a 16-core chip multiprocessor with a 32MB ReRAM L3 cache show that Re-NUCA improves the lifetime of the non-volatile cache by about 42%, on average, with almost no impact on performance.
UR - http://www.scopus.com/inward/record.url?scp=84983332087&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84983332087&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2016.79
DO - 10.1109/IPDPS.2016.79
M3 - Conference contribution
AN - SCOPUS:84983332087
T3 - Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016
SP - 576
EP - 585
BT - Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 23 May 2016 through 27 May 2016
ER -