One of the key challenges facing computer architects and compiler writers is the increasing discrepancy between processor cycle times and main memory access times. To alleviate this problem in array-intensive embedded signal and video processing applications, compilers may employ either control-centric transformations that change the data access patterns of nested loops or data-centric transformations that modify the memory layouts of multidimensional arrays. Most of the memory layout optimizations proposed so far either modify the layout of each array independently or rely on explicit data reorganization at runtime. This paper focuses on a compiler technique, called array regrouping, that automatically maps multiple arrays into a single data (array) space to improve data access patterns. We present a mathematical framework that enables us to systematically derive suitable mappings for a given array-intensive embedded application. The framework divides the arrays accessed in a given program into several groups, and each group is independently layout-transformed to improve spatial locality and reduce the number of conflict misses. Compared with previous approaches, the proposed technique makes two new contributions: 1) it presents a graph-based formulation of the array regrouping problem, and 2) it demonstrates the potential benefits of this aggressive array-regrouping strategy in optimizing the behavior of embedded systems. Extensive experimental results demonstrate significant improvements in cache miss rates and execution times. An important advantage of this approach over previous techniques that target conflict misses is that it reduces conflict misses without increasing the data space requirements of the application being optimized. This is a very desirable property in many embedded/portable environments, where data space requirements determine the minimum physical memory capacity.
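As a rough illustration of why mapping several arrays into one array space can reduce conflict misses without enlarging the data footprint, the following sketch interleaves two arrays and counts misses in a toy direct-mapped cache. It is not the paper's framework: the two-array interleaving, the function names, and the cache parameters (1 KB cache, 32-byte lines, 8-byte elements) are all assumptions chosen so that the two separate arrays collide in the cache.

```python
# Toy model, not the paper's algorithm: all names and parameters are hypothetical.

def count_misses(addresses, cache_bytes=1024, line_bytes=32):
    """Simulate a direct-mapped cache and return the number of misses."""
    num_lines = cache_bytes // line_bytes
    tags = [None] * num_lines          # block currently held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // line_bytes     # memory block containing this address
        slot = block % num_lines       # direct-mapped placement
        if tags[slot] != block:
            tags[slot] = block
            misses += 1
    return misses


def separate_trace(n, elem):
    """Addresses touched by `a[i]; b[i]` when a and b are stored back to back."""
    base_a, base_b = 0, n * elem
    trace = []
    for i in range(n):
        trace.append(base_a + i * elem)
        trace.append(base_b + i * elem)
    return trace


def regrouped_trace(n, elem):
    """Same accesses after regrouping: a[i] -> ab[2*i], b[i] -> ab[2*i + 1]."""
    trace = []
    for i in range(n):
        trace.append((2 * i) * elem)
        trace.append((2 * i + 1) * elem)
    return trace


N, ELEM = 256, 8                       # assumed: 256 elements of 8 bytes per array
separate_misses = count_misses(separate_trace(N, ELEM))
regrouped_misses = count_misses(regrouped_trace(N, ELEM))
print("separate arrays :", separate_misses, "misses")   # 512 in this toy setup
print("regrouped array :", regrouped_misses, "misses")  # 128 in this toy setup
```

In this contrived setup the two layouts occupy the same 4096 bytes, but in the separate layout `a[i]` and `b[i]` always map to the same cache line and evict each other, whereas the regrouped layout places them in the same memory block. Real programs would need an analysis such as the one the paper describes to decide which arrays to group.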
In addition to performance-related issues, the increased use of embedded/portable devices is making the energy efficiency of applications a critical concern. To develop a truly energy-efficient system, energy constraints should be taken into account early in the design process, i.e., at the source level in software compilation and at the behavioral level in hardware compilation. Source-level optimizations are particularly important in data-dominated media applications. In this paper, we also show how our array regrouping strategy increases the energy savings obtained from the multiple low-power operating modes provided by current memory modules. Using a set of array-intensive benchmarks, we observe significant savings in memory system energy.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Hardware and Architecture
- Computational Theory and Mathematics