Abstract
In this paper, we consider the performance gain that can be obtained by using three previously proposed enhancements in concert: aggressive dynamic (run-time) instruction scheduling, the reuse of decoded instructions, and trace scheduling. Both aggressive dynamic instruction scheduling and decoded instruction reuse have been used in commercial systems. We show that these three enhancements complement and support one another. Hence, while each enhancement has been shown to have merit in its own right, we claim that the overall advantage of using them in concert is greater than that obtained by using any one singly. To support this claim, we present results from running benchmarks representing several common multimedia kernels. Subsequent simulations show 7.3 instructions completed per cycle for the best-performing benchmark on a reasonably aggressive microarchitecture that combines trace scheduling of decoded instructions (i.e., decoded traces) with aggressive dynamic execution.
Original language | English (US) |
---|---|
Pages (from-to) | 65-75 |
Number of pages | 11 |
Journal | Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology |
Volume | 22 |
Issue number | 1 |
State | Published - 1999 |
All Science Journal Classification (ASJC) codes
- Signal Processing
- Information Systems
- Electrical and Electronic Engineering