Bandwidth constrained coordinated HW/SW prefetching for multicores

Sai Prashanth Muralidhara, Mahmut Kandemir, Yuanrui Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Prefetching is a highly effective latency-hiding technique that can greatly improve application performance. However, aggressive prefetching can stress the off-chip bandwidth, and the resulting bandwidth stalls can negate the performance gain due to prefetching. In this paper, focusing on a multicore environment, we first study the comparative benefits of hardware and software prefetching and analyze whether the two are complementary or redundant. This analysis also evaluates different aggressiveness levels of hardware prefetching. Second, we weigh the performance benefits of prefetching against the negative performance effects of bandwidth stalls. Third, we propose a hierarchical prefetch management scheme for multicores that controls the prefetch levels such that the overall performance gain is improved. Lastly, we show that our proposed off-chip-bandwidth-aware prefetch management scheme is very effective in practice, leading to performance gains of up to about 10% in system throughput over a bandwidth-agnostic prefetching scheme.

Original language: English (US)
Title of host publication: Euro-Par 2011 Parallel Processing - 17th International Conference, Proceedings
Number of pages: 16
Edition: PART 1
State: Published - 2011
Event: 17th International Conference on Parallel Processing, Euro-Par 2011 - Bordeaux, France
Duration: Aug 29 2011 to Sep 2 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 6852 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Other: 17th International Conference on Parallel Processing, Euro-Par 2011

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

