TY - GEN
T1 - Memory optimizations for fast power-aware sparse computations
AU - Malkowski, Konrad
AU - Raghavan, Padma
AU - Irwin, Mary Jane
PY - 2007
Y1 - 2007
AB - We consider memory subsystem optimizations for improving the performance of sparse scientific computation while reducing the power consumed by the CPU and memory. We first consider a sparse matrix vector multiplication kernel that is at the core of most sparse scientific codes, to evaluate the impact of prefetchers and power-saving modes of the CPU and caches. We show that performance can be improved at significantly lower power levels, leading to over a factor of five improvement in the operations/Joule metric of energy efficiency. We then indicate that these results extend to more complex codes such as a multigrid solver. We also determine a functional representation of the impacts of such optimizations and we indicate how it can be used toward further tuning. Our results thus indicate the potential for cross-layer tuning for multiobjective optimizations by considering both features of the application and the architecture.
UR - http://www.scopus.com/inward/record.url?scp=34548795820&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34548795820&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2007.370501
DO - 10.1109/IPDPS.2007.370501
M3 - Conference contribution
AN - SCOPUS:34548795820
SN - 1424409101
SN - 9781424409105
T3 - Proceedings - 21st International Parallel and Distributed Processing Symposium, IPDPS 2007; Abstracts and CD-ROM
BT - Proceedings - 21st International Parallel and Distributed Processing Symposium, IPDPS 2007; Abstracts and CD-ROM
T2 - 21st International Parallel and Distributed Processing Symposium, IPDPS 2007
Y2 - 26 March 2007 through 30 March 2007
ER -