An important but difficult goal of compiler optimization research is to generate efficient code for applications that operate on large datasets. This is particularly true for out-of-core codes that deal with very large quantities of disk-resident data. Writing an out-of-core version of a given application is more than just increasing the dataset size and extending the loop bounds. It requires careful choreographing of the flow of data between disk storage and main memory, partitioning of the available main memory space among datasets, and restructuring of code using techniques such as loop permutation and iteration space tiling. This article describes transformation techniques for out-of-core programs based on exploiting locality using a combination of loop and data transformations. More specifically, we describe how an optimizing compiler can improve the performance of the code by determining appropriate file layouts for out-of-core arrays, and finding suitable loop transformations in a unified framework. In addition to optimizing a single loop nest, our solution can handle a sequence of loop nests. We also show how to generate code when the file layouts are optimized and how to generalize the technique to an interprocedural setting. Experimental results obtained on a distributed-memory, message-passing multiprocessor machine demonstrate marked improvements in performance due to the optimizations described in this work.
International Journal of Parallel and Distributed Systems and Networks
Published - 2002
All Science Journal Classification (ASJC) codes
- Hardware and Architecture