TY - GEN
T1 - Compiler-directed variable latency aware SPM management to cope with timing problems
AU - Ozturk, O.
AU - Chen, G.
AU - Kandemir, M.
AU - Karakoy, M.
PY - 2007
Y1 - 2007
N2 - As a result of process parameter variations, a large variability in circuit delay occurs in scaled technologies. This delay or latency variation problem is particularly pressing for memory components due to the minimum-sized transistors used to build them. Current memory design techniques mostly cope with such variations by adopting a worst-case design option, which simply assumes all memory locations are operated under the worst possible latency, whereas in reality some memory locations could be much faster than others. Note that uniformly assuming any latency value other than the worst-case latency for all memory locations can lead to reliability problems, since the data may not be ready when the assumed latency has passed. Instead of operating under the worst-case design option, this paper proposes and experimentally evaluates a compiler-driven approach that operates an on-chip scratch-pad memory (SPM) assuming different latencies for the different SPM lines. Our goal is to reduce execution cycles without creating any reliability problems due to variations in access latencies. The proposed scheme achieves its goal by evaluating the reuse of different data items and adopting a reuse- and latency-aware data-to-SPM placement. It also employs data migration within the SPM when it helps to cut down the number of execution cycles further. We also discuss an alternate scheme that can reduce the latency of select SPM locations by controlling a circuit-level mechanism in software to further improve performance. We implemented our approach within an optimizing compiler and tested its effectiveness through extensive simulations. Our experiments with twelve embedded application codes show that the proposed approach performs much better than the worst-case-based design paradigm (16.2% improvement on average) and comes close (within 5.7%) to a hypothetical best-case design (i.e., one with no process variation) where every SPM location uniformly has low latency.
UR - http://www.scopus.com/inward/record.url?scp=34547687545&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34547687545&partnerID=8YFLogxK
U2 - 10.1109/CGO.2007.6l
DO - 10.1109/CGO.2007.6l
M3 - Conference contribution
AN - SCOPUS:34547687545
SN - 0769527647
SN - 9780769527642
T3 - International Symposium on Code Generation and Optimization, CGO 2007
SP - 232
EP - 243
BT - International Symposium on Code Generation and Optimization, CGO 2007
T2 - International Symposium on Code Generation and Optimization, CGO 2007
Y2 - 11 March 2007 through 14 March 2007
ER -