TY - GEN
T1 - Congestion-aware memory management on NUMA platforms
T2 - 2017 IEEE International Symposium on Workload Characterization, IISWC 2017
AU - Kotra, Jagadish B.
AU - Kim, Seongbeom
AU - Madduri, Kamesh
AU - Kandemir, Mahmut T.
N1 - Funding Information:
The authors would like to acknowledge the support of NSF under the grants 1439021, 1439057, 1409095, 1626251, 1629915, 1629129 and 1526750.
Publisher Copyright:
© 2017 IEEE.
PY - 2017/12/5
Y1 - 2017/12/5
N2 - The VMware ESXi hypervisor attracts a wide range of customers and is deployed in domains ranging from desktop computing to server computing. While software systems are increasingly moving towards consolidation, hardware has already transitioned to multi-socket Non-Uniform Memory Access (NUMA)-based systems. The combination of increasing consolidation and multi-socket-based systems warrants low-overhead, simple and practical mechanisms to detect and address performance bottlenecks without causing additional contention for shared resources such as performance counters. In this paper, we propose a simple, practical and highly accurate dynamic memory latency probing mechanism to detect memory congestion in a NUMA system. Using these dynamically probed latencies, we propose congestion-aware memory allocation, congestion-aware memory migration, and a combination of these two techniques. These proposals, evaluated on Intel Westmere (8 nodes) and Intel Haswell (2 nodes) using various workloads, improve overall performance on average by 7.2% and 9.5%, respectively.
AB - The VMware ESXi hypervisor attracts a wide range of customers and is deployed in domains ranging from desktop computing to server computing. While software systems are increasingly moving towards consolidation, hardware has already transitioned to multi-socket Non-Uniform Memory Access (NUMA)-based systems. The combination of increasing consolidation and multi-socket-based systems warrants low-overhead, simple and practical mechanisms to detect and address performance bottlenecks without causing additional contention for shared resources such as performance counters. In this paper, we propose a simple, practical and highly accurate dynamic memory latency probing mechanism to detect memory congestion in a NUMA system. Using these dynamically probed latencies, we propose congestion-aware memory allocation, congestion-aware memory migration, and a combination of these two techniques. These proposals, evaluated on Intel Westmere (8 nodes) and Intel Haswell (2 nodes) using various workloads, improve overall performance on average by 7.2% and 9.5%, respectively.
UR - http://www.scopus.com/inward/record.url?scp=85046552331&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046552331&partnerID=8YFLogxK
U2 - 10.1109/IISWC.2017.8167772
DO - 10.1109/IISWC.2017.8167772
M3 - Conference contribution
AN - SCOPUS:85046552331
T3 - Proceedings of the 2017 IEEE International Symposium on Workload Characterization, IISWC 2017
SP - 146
EP - 155
BT - Proceedings of the 2017 IEEE International Symposium on Workload Characterization, IISWC 2017
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 1 October 2017 through 3 October 2017
ER -