A complexity-effective approach to ALU bandwidth enhancement for instruction-level temporal redundancy

Angshuman Parashar, Sudhanva Gurumurthi, Anand Sivasubramaniam

Research output: Contribution to journal › Conference article › peer-review

45 Scopus citations


Previous proposals for implementing instruction-level temporal redundancy in out-of-order cores have reported a performance degradation of up to 45% in certain applications compared to an execution without any temporal redundancy. An important contributor to this problem is the insufficient number of ALUs for handling the amplified load injected into the core. At the same time, increasing the number of ALUs can increase the complexity of the issue logic, which has been identified as one of the most timing-critical components of the processor. This paper proposes a novel extension of a prior idea on instruction reuse to ease ALU bandwidth requirements in a complexity-effective way by exploiting certain interesting properties of a dual (temporally redundant) instruction stream. We present the microarchitectural extensions necessary for implementing an instruction reuse buffer (IRB) and integrating it with the issue logic of a dual-instruction-stream superscalar core, and conduct extensive evaluations to demonstrate how well it can alleviate the ALU bandwidth problem. We show that, on average, we can gain back nearly 50% of the IPC loss that occurred due to ALU bandwidth limitations for an instruction-level temporally redundant superscalar execution, and 23% of the overall IPC loss.
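
The general idea of instruction reuse in a dual stream can be illustrated with a toy software model. The sketch below is not the paper's microarchitecture: the `ReuseBuffer` class, the `run_dual_stream` function, the eviction policy, and the choice to key entries on the program counter plus operand values are all assumptions made for illustration. It only shows how a redundant copy of an instruction might skip the ALU when an earlier instance with the same operands has already been recorded.

```python
import operator
from dataclasses import dataclass, field


@dataclass
class ReuseBuffer:
    """Hypothetical toy reuse buffer: maps (pc, operands) to a prior result."""
    capacity: int = 4
    entries: dict = field(default_factory=dict)

    def lookup(self, pc, operands):
        # Returns the stored result, or None on a miss.
        return self.entries.get((pc, operands))

    def insert(self, pc, operands, result):
        key = (pc, operands)
        if key not in self.entries and len(self.entries) >= self.capacity:
            # Assumed policy: evict the oldest entry (dicts keep insertion order).
            self.entries.pop(next(iter(self.entries)))
        self.entries[key] = result


def run_dual_stream(trace, irb):
    """Execute every instruction twice and count how many copies used an ALU."""
    alu_ops = 0
    for pc, op, operands in trace:
        primary = op(*operands)            # primary copy always uses an ALU
        alu_ops += 1
        reused = irb.lookup(pc, operands)  # redundant copy tries the buffer first
        if reused is None:
            duplicate = op(*operands)      # miss: redundant copy also needs an ALU
            alu_ops += 1
        else:
            duplicate = reused             # hit: no ALU slot consumed
        if primary != duplicate:
            raise RuntimeError(f"result mismatch at pc={pc:#x}")
        irb.insert(pc, operands, primary)
    return alu_ops


# A made-up trace: one add repeated three times with the same operands,
# followed by a different add.
trace = [(0x10, operator.add, (1, 2))] * 3 + [(0x14, operator.add, (3, 4))]
alu_ops = run_dual_stream(trace, ReuseBuffer())
print(alu_ops)  # 6 ALU operations instead of 8 without reuse
```

In this toy model the redundant copy consults the buffer before the current result is inserted, so a hit always comes from an earlier dynamic instance rather than from the primary copy it is being compared against.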

Original language: English (US)
Pages (from-to): 376-386
Number of pages: 11
Journal: Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA
State: Published - 2004
Event: Proceedings - 31st Annual International Symposium on Computer Architecture - Munich, Germany
Duration: Jun 19 2004 to Jun 23 2004

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture

