TY - GEN
T1 - Application-aware memory system for fair and efficient execution of concurrent GPGPU applications
AU - Jog, Adwait
AU - Bolotin, Evgeny
AU - Guz, Zvika
AU - Parker, Mike
AU - Keckler, Stephen W.
AU - Kandemir, Mahmut T.
AU - Das, Chita R.
N1 - Copyright 2014 Elsevier B.V., All rights reserved.
PY - 2014
Y1 - 2014
N2 - The available computing resources in modern GPUs are growing with each new generation. However, as many general purpose applications with limited thread-scalability are tuned to take advantage of GPUs, available compute resources might not be optimally utilized. To address this, modern GPUs will need to execute multiple kernels simultaneously. As current generations of GPUs (e.g., NVIDIA Kepler, AMD Radeon) already enable concurrent execution of kernels from the same application, in this paper we address the next logical step: executing multiple concurrent applications in GPUs. We show that while this paradigm has a potential to improve the overall system performance, negative interactions among concurrently executing applications in the memory system can severely hamper the performance and fairness among applications. We show that the current application agnostic GPU memory system design can (1) lead to sub-optimal GPU performance; and (2) create significant imbalance in performance slowdowns across kernels. Thus, we argue that GPU memory system should be augmented with application awareness. As one example to the applicability of this concept, we augment the memory system hardware with application awareness such that requests from different applications can be scheduled in a round robin (RR) fashion while still preserving the benefits of the current first-ready FCFS (FR-FCFS) memory scheduling policy. Evaluations with different multi-application workloads demonstrate that the proposed memory scheduling policy, first-ready round-robin FCFS (FR-RR-FCFS), improves fairness and delivers better system performance compared to the existing FR-FCFS memory scheduling scheme.
AB - The available computing resources in modern GPUs are growing with each new generation. However, as many general purpose applications with limited thread-scalability are tuned to take advantage of GPUs, available compute resources might not be optimally utilized. To address this, modern GPUs will need to execute multiple kernels simultaneously. As current generations of GPUs (e.g., NVIDIA Kepler, AMD Radeon) already enable concurrent execution of kernels from the same application, in this paper we address the next logical step: executing multiple concurrent applications in GPUs. We show that while this paradigm has a potential to improve the overall system performance, negative interactions among concurrently executing applications in the memory system can severely hamper the performance and fairness among applications. We show that the current application agnostic GPU memory system design can (1) lead to sub-optimal GPU performance; and (2) create significant imbalance in performance slowdowns across kernels. Thus, we argue that GPU memory system should be augmented with application awareness. As one example to the applicability of this concept, we augment the memory system hardware with application awareness such that requests from different applications can be scheduled in a round robin (RR) fashion while still preserving the benefits of the current first-ready FCFS (FR-FCFS) memory scheduling policy. Evaluations with different multi-application workloads demonstrate that the proposed memory scheduling policy, first-ready round-robin FCFS (FR-RR-FCFS), improves fairness and delivers better system performance compared to the existing FR-FCFS memory scheduling scheme.
UR - http://www.scopus.com/inward/record.url?scp=84898819427&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84898819427&partnerID=8YFLogxK
U2 - 10.1145/2576779.2576780
DO - 10.1145/2576779.2576780
M3 - Conference contribution
AN - SCOPUS:84898819427
SN - 9781450327664
T3 - ACM International Conference Proceeding Series
SP - 1
EP - 8
BT - Proceedings of the 7th Workshop on General Purpose Processing Using Graphics Processing Units, GPGPU 2014
PB - Association for Computing Machinery
T2 - 7th Workshop on General Purpose Processing Using Graphics Processing Units, GPGPU 2014
Y2 - 1 March 2014 through 1 March 2014
ER -