TY - GEN
T1 - D-SOAP
T2 - 53rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2020
AU - Liao, Minli Julie
AU - Sampson, Jack
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/10
Y1 - 2020/10
N2 - Previous works have shown the possibility of constructing row-column and multi-stride memory systems that can simultaneously exploit dense access along multiple logical data orientations to offer more than 3x speedups on some workloads. However, existing multi-orientation memory (MOM) and MOM-caching approaches presume that the orientation preference of a memory request is statically determinable and rely on both ISA and compiler changes to express and extract these preferences for performance gains. Thus, current MOM-caching approaches cannot readily provide benefits in the presence of dynamism with respect to data layout, data-dependent code behavior, or access ordering. Accurate orientation prediction will allow MOMs to benefit a larger range of workloads. In this paper, we describe the sources of orientation preference dynamism and show that the sensitivity of orientation prediction to cache line utilization, as well as to access pattern, differentiates it from stride prediction. We introduce a hardware-managed, utilization-focused orientation predictor, D-SOAP, and compare it with a set of variants (D-SOAP-*) that use utilization, local stride analysis, and prefetcher feedback as sources of information, both in isolation and in combination, to predict orientation preference and to evaluate the impact of each information source. We evaluate the D-SOAP mechanisms on workloads with both dynamic, data-value-dependent (DVD) and statically identifiable, data-value-independent (DVI) orientation preferences.
We demonstrate that the D-SOAP variants using utilization information 1) track the performance of the preferred orientation within 2%, on average, and 18%, at worst, across microbenchmarks sweeping data distributions for DVD patterns, avoiding slowdowns of up to 267% seen with misaligned static preferences; 2) provide competitive performance (4% speedup, on average) compared to prior static annotation approaches relying on a priori data profiling, in DVD scenarios; and 3) closely track static annotations for DVI scenarios that lack any exploitable dynamism (1.8% speedup, on average).
UR - https://www.scopus.com/pages/publications/85097354604
UR - https://www.scopus.com/inward/citedby.url?scp=85097354604&partnerID=8YFLogxK
U2 - 10.1109/MICRO50266.2020.00055
DO - 10.1109/MICRO50266.2020.00055
M3 - Conference contribution
AN - SCOPUS:85097354604
T3 - Proceedings of the Annual International Symposium on Microarchitecture, MICRO
SP - 581
EP - 595
BT - Proceedings - 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2020
PB - IEEE Computer Society
Y2 - 17 October 2020 through 21 October 2020
ER -