Improving last level cache locality by integrating loop and data transformations

Wei Ding, Mahmut Kandemir

Research output: Contribution to journalConference articlepeer-review

5 Scopus citations


Motivated by the observation that most existing data locality optimizations do not specifically target shared last-level caches of emerging multicores and that even multicore-specific locality-oriented techniques employ either loop or data layout optimizations but not both, in this paper we present an integrated loop and data layout optimization strategy, with the goal of improving the last-level cache performance of multicores that execute multithreaded applications. We present a detailed mathematical formulation of our locality optimization strategy and present experimental data from our current implementation. Our results, collected using 14 application programs, clearly show that the proposed integrated approach is very successful in practice, and outperforms both pure loop optimization and pure data layout optimization based alternatives. Our results also indicate that the savings achieved increase with increased core count and larger data set sizes.

Original languageEnglish (US)
Article number6386590
Pages (from-to)65-72
Number of pages8
JournalIEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD
StatePublished - 2012
Event2012 30th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2012 - San Jose, CA, United States
Duration: Nov 5 2012Nov 8 2012

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Science Applications
  • Computer Graphics and Computer-Aided Design


Dive into the research topics of 'Improving last level cache locality by integrating loop and data transformations'. Together they form a unique fingerprint.

Cite this