Interprocedural optimizations for improving data cache performance of array-intensive embedded applications

W. Zhang, G. Chen, M. Kandemir, M. Karakoy

Research output: Contribution to journal, Conference article, peer-review

4 Scopus citations


As datasets processed by embedded processors increase in size and complexity, the management of the higher levels of the memory hierarchy (e.g., caches) is becoming an important issue. A major limitation of most cache locality optimization techniques proposed by previous research is that they handle a single procedure at a time. This prevents compilers from capturing the data access interactions between procedures and may result in poor performance. In this paper, we look at loop and data transformations from a different angle and use them in an interprocedural optimization framework. Employing the call graph representation of a given application, the proposed technique visits each node of this graph twice and uses loop and data transformations in a systematic way to optimize array layouts program-wide. Our experimental results show that this interprocedural locality optimization strategy is much more effective than previous locality-based techniques that handle each procedure in isolation.
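The two-visit idea in the abstract can be illustrated with a minimal sketch. The abstract does not say what each visit does, so everything below is an assumption made for illustration: the first visit is taken to be a bottom-up pass that collects weighted array layout preferences, and the second a top-down pass that fixes one layout per array and marks the procedures whose loops would need to be transformed to match it. The call graph, the procedure names, the access counts, and the "row"/"col" layout labels are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical acyclic call graph: caller -> callees.
CALL_GRAPH = {"main": ["smooth", "transpose_sum"], "smooth": [], "transpose_sum": []}

# Hypothetical access summary per procedure: array -> (preferred layout, access count).
# "row" means the innermost loop walks along a row, "col" along a column.
ACCESSES = {
    "main": {},
    "smooth": {"A": ("row", 1000)},
    "transpose_sum": {"A": ("col", 200)},
}


def bottom_up(proc, votes, seen):
    """Assumed first visit: gather weighted layout preferences, callees first."""
    if proc in seen:
        return
    seen.add(proc)
    for callee in CALL_GRAPH[proc]:
        bottom_up(callee, votes, seen)
    for array, (layout, weight) in ACCESSES[proc].items():
        votes[array][layout] += weight


def top_down(proc, layouts, plan, seen):
    """Assumed second visit: with layouts fixed, decide what each procedure must do."""
    if proc in seen:
        return
    seen.add(proc)
    plan[proc] = {
        array: "keep loops" if layout == layouts[array] else "interchange loops"
        for array, (layout, _) in ACCESSES[proc].items()
    }
    for callee in CALL_GRAPH[proc]:
        top_down(callee, layouts, plan, seen)


def optimize(root):
    """Visit every node of the call graph twice and return (layouts, per-procedure plan)."""
    votes = defaultdict(lambda: defaultdict(int))
    bottom_up(root, votes, set())
    # Pick, for each array, the layout preferred by the largest number of accesses.
    layouts = {array: max(v, key=v.get) for array, v in votes.items()}
    plan = {}
    top_down(root, layouts, plan, set())
    return layouts, plan


layouts, plan = optimize("main")
print(layouts)
print(plan)
```

In this toy setup the array `A` is accessed far more often row-wise, so a single program-wide layout is chosen for it and the column-wise procedure is the one flagged for a loop transformation. A per-procedure optimizer, by contrast, could pick conflicting layouts for the same array in `smooth` and `transpose_sum`.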

Original language: English (US)
Pages (from-to): 887-892
Number of pages: 6
Journal: Proceedings - Design Automation Conference
State: Published - 2003
Event: Proceedings of the 40th Design Automation Conference - Anaheim, CA, United States
Duration: Jun 2 2003 to Jun 6 2003

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Control and Systems Engineering

