TY - GEN
T1 - SPM conscious loop scheduling for embedded chip multiprocessors
AU - Xue, Liping
AU - Kandemir, Mahmut
AU - Chen, Guangyu
AU - Yemliha, Taylan
PY - 2006
Y1 - 2006
N2 - One of the major factors that can potentially slow down widespread use of embedded chip multiprocessors is lack of efficient software support. In particular, automated code parallelizers are badly needed since it is not realistic to expect an average programmer to parallelize a large complex embedded application over multiple processors, taking into account several factors at the same time such as code density, data locality, performance, power and code resilience. Especially, increasing use of software-managed SPM (scratch-pad memory) components in embedded systems require an SPM conscious code parallelization. Motivated by this observation, this paper proposes a novel compiler-based SPM conscious loop scheduling strategy for array/loop based embedded applications. This strategy tries to achieve two objectives. First, the sets of loop iterations assigned to different processors should approximately take the same amount of time to finish. Second, the set of iterations assigned to a processor should exhibit high data reuse. Satisfying these two objectives help us to minimize parallel execution time of the application at hand. The specific method adopted by our scheduling strategy to achieve these objectives is to distribute loop iterations across parallel processors in an SPM conscious manner. In this strategy, the compiler analyzes the loop, identifies the potential SPM hits and misses, and distributes loop iterations over processors such that the processors have more or less the same execution time. Our experimental results so far indicate that the proposed approach generates much better results than existing loop schedulers. Specifically, it brings 18.9%, 22.4%, and 11.1% improvements in parallel execution time (with a chip multiprocessor of 8 cores) over a previously proposed static scheduler, a dynamic scheduler, and an alternate locality-conscious scheduler, respectively.
AB - One of the major factors that can potentially slow down widespread use of embedded chip multiprocessors is lack of efficient software support. In particular, automated code parallelizers are badly needed since it is not realistic to expect an average programmer to parallelize a large complex embedded application over multiple processors, taking into account several factors at the same time such as code density, data locality, performance, power and code resilience. Especially, increasing use of software-managed SPM (scratch-pad memory) components in embedded systems require an SPM conscious code parallelization. Motivated by this observation, this paper proposes a novel compiler-based SPM conscious loop scheduling strategy for array/loop based embedded applications. This strategy tries to achieve two objectives. First, the sets of loop iterations assigned to different processors should approximately take the same amount of time to finish. Second, the set of iterations assigned to a processor should exhibit high data reuse. Satisfying these two objectives help us to minimize parallel execution time of the application at hand. The specific method adopted by our scheduling strategy to achieve these objectives is to distribute loop iterations across parallel processors in an SPM conscious manner. In this strategy, the compiler analyzes the loop, identifies the potential SPM hits and misses, and distributes loop iterations over processors such that the processors have more or less the same execution time. Our experimental results so far indicate that the proposed approach generates much better results than existing loop schedulers. Specifically, it brings 18.9%, 22.4%, and 11.1% improvements in parallel execution time (with a chip multiprocessor of 8 cores) over a previously proposed static scheduler, a dynamic scheduler, and an alternate locality-conscious scheduler, respectively.
UR - http://www.scopus.com/inward/record.url?scp=34047222785&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34047222785&partnerID=8YFLogxK
U2 - 10.1109/ICPADS.2006.102
DO - 10.1109/ICPADS.2006.102
M3 - Conference contribution
AN - SCOPUS:34047222785
SN - 0769526128
SN - 9780769526126
T3 - Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
SP - 8
EP - 15
BT - Proceedings - 12th International Conference on Parallel and Distributed Systems, ICPADS 2006
PB - IEEE Computer Society
T2 - 12th International Conference on Parallel and Distributed Systems, ICPADS 2006
Y2 - 12 July 2006 through 15 July 2006
ER -