Abstract
There has been relatively little analytical work on processor optimizations for multimedia applications. With the introduction of MMX by Intel, it is clear that this is an area of increasing importance. Building on previous work, we propose optimizations for multimedia architectures that support independent parallel execution of instructions within dynamically assembled traces, resulting in dramatic performance improvements. Specifically, we propose instruction scheduling and register renaming algorithms that are simplified by constraints placed on trace formation. In addition, we suggest specific instruction pool and trace cache parameters. We constructed a simulator to measure the benefits of these processor optimizations for multimedia applications. The simulated machine, which could fetch and decode 2 instructions per cycle, performed better than a superscalar machine that could fetch and decode 8 instructions per cycle. Execution rates as high as 7.3 instructions per cycle were achieved for the benchmarks simulated, assuming 16 instructions per trace.
Original language | English (US) |
---|---|
Title of host publication | Proceedings of the International Parallel Processing Symposium, IPPS |
Publisher | IEEE Comp Soc |
Pages | 640-646 |
Number of pages | 7 |
ISBN (Print) | 0818684046 |
State | Published - 1998 |
Event | Proceedings of the 1998 12th International Parallel Processing Symposium and 9th Symposium on Parallel and Distributed Processing - Orlando, FL, USA; Duration: Mar 30 1998 → Apr 3 1998 |
Other
Other | Proceedings of the 1998 12th International Parallel Processing Symposium and 9th Symposium on Parallel and Distributed Processing |
---|---|
City | Orlando, FL, USA |
Period | 3/30/98 → 4/3/98 |
All Science Journal Classification (ASJC) codes
- Hardware and Architecture