TY - JOUR
T1 - Adapting application execution in CMPs using helper threads
AU - Ding, Yang
AU - Kandemir, Mahmut
AU - Raghavan, Padma
AU - Irwin, Mary Jane
N1 - Author Biography:
Padma Raghavan is a Professor of Computer Science and Engineering and the Director of the Institute for CyberScience at the Pennsylvania State University. Raghavan conducts research in high performance computing, with sparsity as a unifying abstraction from computational science to computer architecture, toward increasing computational performance by constant factors to orders of magnitude. Raghavan received the CAREER award from NSF and the Maria Goeppert Mayer Distinguished Scholar Award from the University of Chicago and Argonne National Laboratory. Raghavan currently serves on the editorial board of the SIAM Journal on Scientific Computing and on the program committees of major conferences in parallel computing sponsored by ACM, IEEE, and SIAM. Raghavan received her M.S. (1987) and Ph.D. (1992) degrees in computer science from the Pennsylvania State University.
Funding Information:
This work is supported in part by NSF grants CCF 0811687, OCI 0821527, CNS 0720645, CNS 0720749, CCF 0702519, and a grant from Microsoft Corporation. The authors also acknowledge the support of the Gigascale Systems Research Focus Center, one of five research centers funded under the Focus Center Research Program, a Semiconductor Research Corporation program. We thank the anonymous reviewers for their helpful comments and suggestions.
N1 - Author Biography:
Mahmut Kandemir is an associate professor in the Department of Computer Science and Engineering at Penn State University. His research interests are in optimizing compilers, runtime systems, embedded systems, I/O and high performance storage, and power-aware computing. He has served on the program committees of 40 conferences and workshops. His research is funded by NSF, DARPA, and SRC. He is a recipient of the NSF CAREER Award and the Penn State Engineering Society Outstanding Research Award. Kandemir received his Ph.D. (1999) in computer science from Syracuse University.
PY - 2009/9
Y1 - 2009/9
N2 - In parallel with the changes in both the architecture domain (the move toward chip multiprocessors, or CMPs) and the application domain (the move toward increasingly data-intensive workloads), issues such as performance, energy efficiency, and CPU availability are becoming increasingly critical. CPU availability can change dynamically for several reasons, such as thermal overload, an increase in transient errors, or operating system scheduling. An important question in this context is how to adapt, in a CMP, the execution of a given application to CPU availability changes at runtime. Our paper studies this problem, targeting the energy-delay product (EDP) as the main metric to optimize. We first show that, in adapting the application execution to varying CPU availability, one needs to consider the number of CPUs to use, the number of application threads to accommodate, and the voltage/frequency levels to employ (if the CMP has this capability). We then propose using helper threads to adapt the application execution to CPU availability changes, with the goal of minimizing the EDP. The helper thread runs in parallel with the application execution threads and tries to determine the ideal number of CPUs, threads, and voltage/frequency levels to employ at any given point in the execution. We illustrate this idea using four applications (Fast Fourier Transform, MultiGrid, LU decomposition, and Conjugate Gradient) under different execution scenarios. The results collected through our experiments are very promising and indicate that significant EDP reductions are possible using helper threads. For example, we achieved EDP savings of up to 66.3%, 83.3%, 91.2%, and 94.2% when adjusting all the parameters properly in the FFT, MG, LU, and CG applications, respectively. We also discuss how our approach can be extended to address multi-programmed workloads.
AB - In parallel with the changes in both the architecture domain (the move toward chip multiprocessors, or CMPs) and the application domain (the move toward increasingly data-intensive workloads), issues such as performance, energy efficiency, and CPU availability are becoming increasingly critical. CPU availability can change dynamically for several reasons, such as thermal overload, an increase in transient errors, or operating system scheduling. An important question in this context is how to adapt, in a CMP, the execution of a given application to CPU availability changes at runtime. Our paper studies this problem, targeting the energy-delay product (EDP) as the main metric to optimize. We first show that, in adapting the application execution to varying CPU availability, one needs to consider the number of CPUs to use, the number of application threads to accommodate, and the voltage/frequency levels to employ (if the CMP has this capability). We then propose using helper threads to adapt the application execution to CPU availability changes, with the goal of minimizing the EDP. The helper thread runs in parallel with the application execution threads and tries to determine the ideal number of CPUs, threads, and voltage/frequency levels to employ at any given point in the execution. We illustrate this idea using four applications (Fast Fourier Transform, MultiGrid, LU decomposition, and Conjugate Gradient) under different execution scenarios. The results collected through our experiments are very promising and indicate that significant EDP reductions are possible using helper threads. For example, we achieved EDP savings of up to 66.3%, 83.3%, 91.2%, and 94.2% when adjusting all the parameters properly in the FFT, MG, LU, and CG applications, respectively. We also discuss how our approach can be extended to address multi-programmed workloads.
UR - http://www.scopus.com/inward/record.url?scp=67651006179&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=67651006179&partnerID=8YFLogxK
U2 - 10.1016/j.jpdc.2009.04.004
DO - 10.1016/j.jpdc.2009.04.004
M3 - Article
AN - SCOPUS:67651006179
SN - 0743-7315
VL - 69
SP - 790
EP - 806
JO - Journal of Parallel and Distributed Computing
JF - Journal of Parallel and Distributed Computing
IS - 9
ER -