Abstract
Energy efficiency of computation is quickly becoming a key problem from the chip through the data center. This paper presents the first quantitative study of the potential energy efficiency of vector accelerators. We propose and study a vector accelerator architecture suitable for implementation in a 70 nm technology. The vector architecture has a high-bandwidth on-chip cache system coupled to 16 independent memory channels. We show that such an accelerator can achieve speedups of 10X or more on loop kernels in comparison to a quad-issue superscalar uniprocessor, while using less energy. We also introduce run-ahead lanes, a complexity- and energy-efficient means of tolerating variable latency from crossbar contention, cache bank conflicts, cache misses, and the memory system. Run-ahead lanes only synchronize on dependencies or when explicitly directed.
| Original language | English (US) |
|---|---|
| Title of host publication | Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, SC'06 |
| State | Published - 2006 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 7: Affordable and Clean Energy
All Science Journal Classification (ASJC) codes
- General Computer Science