Abstract
In a network-on-chip (NoC) based manycore architecture, an off-chip data access (main memory access) needs to travel through the on-chip network, spending a considerable amount of time within the chip (in addition to the memory access latency). It also contends with on-chip (cache) accesses, as both use the same NoC resources. In this paper, focusing on data-parallel, multithreaded applications, we propose a compiler-based off-chip data access localization strategy, which places data elements in the memory space such that an off-chip access traverses a minimum number of links (hops) to reach the memory controller that handles this access. This brings three main benefits. First, the network latency of off-chip accesses is reduced; second, the network latency of on-chip accesses is reduced; and finally, the memory latency of off-chip accesses improves, due to reduced queue latencies. We present an experimental evaluation of our optimization strategy using a set of 13 multithreaded application programs under both private and shared last-level caches. The results collected emphasize the importance of optimizing the off-chip data accesses.
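The hop-count objective in the abstract can be illustrated with a small sketch. It is not the paper's compiler algorithm. It only shows, under assumptions the abstract does not state (a 2D mesh NoC, dimension-ordered routing so that hop count equals Manhattan distance, and four memory controllers at the mesh corners), how one might pick the controller closest to a core. The names `hops`, `nearest_controller`, `MESH`, and `CONTROLLERS` are hypothetical.

```python
from typing import Sequence, Tuple

Coord = Tuple[int, int]  # (row, column) of a node in the mesh

# Assumption: an 8x8 mesh with one memory controller at each corner.
MESH = 8
CONTROLLERS = [(0, 0), (0, MESH - 1), (MESH - 1, 0), (MESH - 1, MESH - 1)]


def hops(src: Coord, dst: Coord) -> int:
    """Links traversed between two nodes, assuming minimal mesh routing."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def nearest_controller(core: Coord, controllers: Sequence[Coord]) -> Coord:
    """Return the controller reachable from `core` in the fewest hops."""
    return min(controllers, key=lambda mc: hops(core, mc))
```

In this simplified picture, a localization strategy would try to place the data a core accesses in memory served by `nearest_controller(core, CONTROLLERS)`, so that each off-chip request crosses as few links as possible.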
| Original language | English (US) |
|---|---|
| Pages (from-to) | 131-142 |
| Number of pages | 12 |
| Journal | ACM SIGPLAN Notices |
| Volume | 50 |
| Issue number | 6 |
| State | Published - Jun 2015 |
All Science Journal Classification (ASJC) codes
- General Computer Science
Fingerprint
Dive into the research topics of 'Optimizing off-chip accesses in multicores'. Together they form a unique fingerprint.