Compiler directed network-on-chip reliability enhancement for chip multiprocessors

Ozcan Ozturk, Mahmut Kandemir, Mary J. Irwin, Sri H.K. Narayanan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Chip multiprocessors (CMPs) are expected to be the building blocks for future computer systems. While architecting these emerging CMPs is a challenging problem on its own, programming them is even more challenging. As the number of cores accommodated in chip multiprocessors increases, network-on-chip (NoC) type communication fabrics are expected to replace traditional point-to-point buses. Most prior software-related work targeting CMPs focuses on performance and power. However, as technology scales, the components of a CMP are increasingly exposed to both transient and permanent hardware failures. This paper presents and evaluates a compiler-directed, power- and performance-aware reliability enhancement scheme for NoC-based CMPs. The proposed scheme improves on-chip communication reliability by duplicating messages traveling across CMP nodes such that, for each original message, its duplicate uses a different set of communication links as far as the performance constraint allows. In addition, our approach tries to reuse communication links across the different phases of the program to maximize link shutdown opportunities for the NoC and thereby satisfy the power constraint. Our results show that the proposed approach is very effective in improving on-chip network reliability without causing excessive power or performance degradation. In our experiments, we also evaluate the performance-oriented and energy-oriented versions of our compiler-directed reliability enhancement scheme, and compare them with two purely hardware-based fault-tolerant routing schemes.
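
The idea of sending a duplicate message over different links can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a 2D mesh NoC and uses simple dimension-ordered routing, with the original message routed x-first and the duplicate routed y-first. The function names `route` and `duplicate_routes` are hypothetical, chosen only for this illustration.

```python
def route(src, dst, x_first=True):
    """Return the directed links of a dimension-ordered route on a 2D mesh.

    Nodes are (x, y) tuples; a link is a (from_node, to_node) pair.
    Assumption: a plain mesh with no faulty or disabled links.
    """
    pos = list(src)
    links = []
    order = (0, 1) if x_first else (1, 0)
    for dim in order:
        while pos[dim] != dst[dim]:
            prev = tuple(pos)
            pos[dim] += 1 if dst[dim] > pos[dim] else -1
            links.append((prev, tuple(pos)))
    return links


def duplicate_routes(src, dst):
    """Route the original x-first and the duplicate y-first.

    Returns both link lists plus the set of links they share.
    """
    primary = route(src, dst, x_first=True)
    backup = route(src, dst, x_first=False)
    shared = set(primary) & set(backup)
    return primary, backup, shared


if __name__ == "__main__":
    primary, backup, shared = duplicate_routes((0, 0), (2, 1))
    print("original :", primary)
    print("duplicate:", backup)
    print("shared   :", shared)
```

When source and destination differ in both coordinates, the two routes in this sketch share no links, so a single link failure cannot corrupt both copies. When they lie in the same row or column, both routes coincide, which shows why link disjointness can only be achieved "as much as possible" without taking longer detours.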

Original language: English (US)
Title of host publication: LCTES'10 - Proceedings of the ACM SIGPLAN/SIGBED 2010 Conference on Languages, Compilers, and Tools for Embedded Systems
Pages: 85-94
Number of pages: 10
State: Published - 2010
Event: ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems, LCTES 2010 - Stockholm, Sweden
Duration: Apr 13 2010 → Apr 15 2010

Publication series

Name: Proceedings of the ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES)

Other

Other: ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems, LCTES 2010
Country/Territory: Sweden
City: Stockholm
Period: 4/13/10 → 4/15/10

All Science Journal Classification (ASJC) codes

  • Software
