Cashing in on hints for better prefetching and caching in PVFS and MPI-IO

Christina M. Patrick, Mahmut Kandemir, Mustafa Karaköy, Seung Woo Son, Alok Choudhary

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

13 Scopus citations

Abstract

In this work, we propose, implement and test a novel approach to the management of parallel I/O in high-performance computing. Our proposed approach is built upon three complementary ideas: (i) allowing users to place hints into the application code indicating high-level data access patterns, (ii) enabling an optimizing compiler to process these hints and develop I/O optimization strategies, and (iii) enhancing the I/O stack to accept these optimizations and process them across the different layers in the stack. We describe a general hint processing framework that accommodates this approach and demonstrate its potential by applying it to two sample problems: (i) shared storage cache management and (ii) I/O prefetching. In the former, our approach decides, at each program point of interest, the ideal set of data blocks to keep in shared storage caches in the I/O stack, and in the latter, the high-level data access pattern is propagated from the application layer to the parallel file system layer for prefetching data from the storage subsystem. Our approach is designed to complement and work synergistically with the MPI-IO and PVFS frameworks and exploits the characteristics of applications written using these frameworks. We tested our approach using both synthetic data access patterns and disk I/O intensive application programs. The results collected indicate that the proposed approach improves over existing storage caching and I/O prefetching schemes by 28% and 66%, respectively.
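
To illustrate the general idea of hint-driven prefetching described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical hint format (a dict with `pattern`, `stride`, and `depth` keys), a toy LRU cache standing in for a shared storage cache, and hypothetical names `BlockCache` and `run`. It does not model MPI-IO, PVFS, or the compiler pass.

```python
from collections import OrderedDict


class BlockCache:
    """Tiny LRU block cache standing in for a shared storage cache (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block id -> None, ordered oldest to newest
        self.hits = 0
        self.misses = 0

    def _insert(self, block):
        # Add or refresh a block, evicting the least recently used ones if over capacity.
        self.blocks[block] = None
        self.blocks.move_to_end(block)
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)

    def prefetch(self, block):
        # Bring a block in ahead of use; this counts as neither a hit nor a miss.
        self._insert(block)

    def read(self, block):
        # Demand read: record a hit or a miss, then make the block most recent.
        if block in self.blocks:
            self.hits += 1
        else:
            self.misses += 1
        self._insert(block)


def run(accesses, capacity, hint=None):
    """Replay block accesses and return (hits, misses).

    `hint` is a hypothetical application-level hint, for example
    {"pattern": "strided", "stride": 4, "depth": 2}.
    """
    cache = BlockCache(capacity)
    for block in accesses:
        cache.read(block)
        if hint and hint.get("pattern") == "strided":
            # Use the declared stride to fetch the next `depth` blocks early.
            for k in range(1, hint["depth"] + 1):
                cache.prefetch(block + k * hint["stride"])
    return cache.hits, cache.misses


if __name__ == "__main__":
    accesses = list(range(0, 40, 4))  # blocks 0, 4, 8, ..., 36
    strided = {"pattern": "strided", "stride": 4, "depth": 2}
    print("no hint:  ", run(accesses, capacity=8))
    print("with hint:", run(accesses, capacity=8, hint=strided))
```

In this toy setting, a strided scan with no hint misses on every block, while the same scan with a matching stride hint misses only on the first block. The numbers say nothing about the 28% and 66% improvements reported in the abstract, which come from the authors' own experiments.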

Original language: English (US)
Title of host publication: HPDC 2010 - Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Pages: 191-202
Number of pages: 12
State: Published - 2010
Event: 19th ACM International Symposium on High Performance Distributed Computing, HPDC 2010 - Chicago, IL, United States
Duration: Jun 21 2010 to Jun 25 2010

Publication series

Name: HPDC 2010 - Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing

Other

Other: 19th ACM International Symposium on High Performance Distributed Computing, HPDC 2010
Country/Territory: United States
City: Chicago, IL
Period: 6/21/10 to 6/25/10

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Software
