On improving performance and energy profiles of sparse scientific applications

Konrad Malkowski, Ingyu Lee, Padma Raghavan, Mary Jane Irwin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

In many scientific applications, the majority of the execution time is spent within a few basic sparse kernels such as sparse matrix-vector multiplication (SMV). Such sparse kernels can utilize only a fraction of the available processing speed because of their relatively large number of data accesses per floating-point operation and their limited data locality and data reuse. Algorithmic changes and tuning of codes through blocking and loop unrolling schemes can improve performance, but such tuned versions are typically not available in benchmark suites such as the SPEC CFP 2000. In this paper, we consider SMV kernels with different levels of tuning that are representative of this application space. We emulate certain memory subsystem optimizations using SimpleScalar and Wattch to evaluate improvements in performance and energy metrics. We also characterize how such an evaluation can be affected by the interplay between code tuning and memory subsystem optimizations. Our results indicate that the optimizations reduce execution time by over 40% and energy by over 85% when used with power control modes of CPUs and caches. Furthermore, the relative impact of the same set of memory subsystem optimizations can vary significantly depending on the level of code tuning. Consequently, it may be appropriate to augment traditional benchmarks with tuned kernels typical of high-performance sparse scientific codes to enable comprehensive evaluations of future systems.
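
For readers unfamiliar with the kernel, the following is a minimal Python sketch of SMV, not the authors' code. It assumes the compressed sparse row (CSR) storage format, which the abstract does not specify, and the function names `csr_matvec` and `csr_matvec_unrolled` are hypothetical. The first version shows why each multiply-add needs several memory accesses; the second illustrates loop unrolling by a factor of two, one of the tuning schemes the abstract mentions.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A * x for a sparse matrix A held in CSR form (assumed format)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # One multiply-add reads values[k], col_idx[k] and x[col_idx[k]];
            # the indirect access to x is what limits data locality.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def csr_matvec_unrolled(values, col_idx, row_ptr, x):
    """Same product with the inner loop unrolled by two (illustrative tuning only)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        k, end = row_ptr[i], row_ptr[i + 1]
        acc0 = acc1 = 0.0
        while k + 1 < end:
            # Two independent accumulators per iteration.
            acc0 += values[k] * x[col_idx[k]]
            acc1 += values[k + 1] * x[col_idx[k + 1]]
            k += 2
        if k < end:
            # Leftover nonzero when the row has an odd number of entries.
            acc0 += values[k] * x[col_idx[k]]
        y[i] = acc0 + acc1
    return y


# Example matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]] in CSR form.
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]
print(csr_matvec(values, col_idx, row_ptr, x))
print(csr_matvec_unrolled(values, col_idx, row_ptr, x))
```

In an interpreted language the unrolled version will not show the speedup it can give in compiled code; the sketch only illustrates the structure of the transformation.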

Original language: English (US)
Title of host publication: 20th International Parallel and Distributed Processing Symposium, IPDPS 2006
Publisher: IEEE Computer Society
ISBN (Print): 1424400546, 9781424400546
State: Published - 2006
Event: 20th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2006 - Rhodes Island, Greece
Duration: Apr 25 2006 to Apr 29 2006

Publication series

Name: 20th International Parallel and Distributed Processing Symposium, IPDPS 2006
Volume: 2006

Other

Other: 20th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2006
Country/Territory: Greece
City: Rhodes Island
Period: 4/25/06 to 4/29/06

All Science Journal Classification (ASJC) codes

  • Engineering (all)
