TY - GEN
T1 - Intra-application cache partitioning
AU - Muralidhara, Sai Prashanth
AU - Kandemir, Mahmut
AU - Raghavan, Padma
PY - 2010
Y1 - 2010
N2 - Efficient management of shared on-chip resources such as the shared level 2 (L2) cache has become an important problem with the emergence of chip multiprocessors (CMPs). Partitioning the shared cache in CMPs among concurrently executing applications can provide important benefits such as throughput improvement, fairness guarantees, and quality of service (QoS) enhancements. In this paper, we pose a related question: can partitioning the shared cache space among concurrently executing threads of the same application enhance application performance? We address this problem by identifying and speeding up the slowest thread, also termed the critical path thread, during each execution interval, since the overall performance of a multithreaded application is determined by the critical path thread. To do so, we propose a dynamic, runtime-system-based cache partitioning scheme that partitions the shared cache space among the individual threads of a given application. In a nutshell, we wish to take some cache space away from the faster threads and give it to the critical path thread at each execution interval. We show that speeding up the critical path thread this way results in an overall performance enhancement of the application in the long term. Our experimental evaluation indicates that the proposed dynamic cache partitioning scheme yields benefits of up to 15% over a shared cache with no partitions, up to 23% over a statically partitioned cache (private cache), and up to 20% over a throughput-oriented scheme.
AB - Efficient management of shared on-chip resources such as the shared level 2 (L2) cache has become an important problem with the emergence of chip multiprocessors (CMPs). Partitioning the shared cache in CMPs among concurrently executing applications can provide important benefits such as throughput improvement, fairness guarantees, and quality of service (QoS) enhancements. In this paper, we pose a related question: can partitioning the shared cache space among concurrently executing threads of the same application enhance application performance? We address this problem by identifying and speeding up the slowest thread, also termed the critical path thread, during each execution interval, since the overall performance of a multithreaded application is determined by the critical path thread. To do so, we propose a dynamic, runtime-system-based cache partitioning scheme that partitions the shared cache space among the individual threads of a given application. In a nutshell, we wish to take some cache space away from the faster threads and give it to the critical path thread at each execution interval. We show that speeding up the critical path thread this way results in an overall performance enhancement of the application in the long term. Our experimental evaluation indicates that the proposed dynamic cache partitioning scheme yields benefits of up to 15% over a shared cache with no partitions, up to 23% over a statically partitioned cache (private cache), and up to 20% over a throughput-oriented scheme.
UR - http://www.scopus.com/inward/record.url?scp=77954012717&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954012717&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2010.5470416
DO - 10.1109/IPDPS.2010.5470416
M3 - Conference contribution
AN - SCOPUS:77954012717
SN - 9781424464432
T3 - Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2010
BT - Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2010
T2 - 24th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2010
Y2 - 19 April 2010 through 23 April 2010
ER -