Tera-scale high-performance computing has enabled scientists to tackle very large and computationally challenging problems, such as those found in the scientific computing domain. However, as computing scales to levels never seen before, it also becomes extremely data intensive and I/O intensive. Thus, I/O is becoming a major bottleneck, thereby slowing the expected pace of scientific discovery and analysis of data. Furthermore, in order to cope with larger problems and data sizes, models and applications are being designed to be dynamic in nature. That is, the applications are dynamic both in terms of their computation patterns as well as data access patterns. Due to the complexities of systems and applications, it is, therefore, very important to address research issues and develop dynamic techniques at the level of runtime systems and compilers to scale I/O in the right proportions.
The objectives of this project are to design and develop next generation software techniques to address the data, I/O, and storage bottlenecks for large-scale scientific applications. Particularly, this project aims to investigate dynamic runtime and compilation techniques for scalable I/O optimizations for largescale systems. Another important aspect will be to drive these optimizations by learning and characterizing performance of I/O and data accesses, and subsequently using those to develop rules that will be used by dynamic runtime and compilation systems to enable high-performance I/O. Current state-of-the-art compiler support for I/O-intensive applications is tremendously lacking. Runtime needs of many large-scale I/O-intensive applications can benefit a lot from a robust dynamic compilation and linking infrastructure. The specific objectives of this project are:
. Developing an understanding of dynamically varying data access needs of I/O-intensive applications,
. Capturing dynamic access patterns and application steering information at runtime within a metadata manager,
. Designing and implementing dynamic compilation techniques based on the runtime access patterns
and performance statistics collected by the metadata manager,
. Designing and implementing a layout manager that collects storage format (layout) suggestions from multiple concurrently executing applications and determines the globally acceptable storage layouts for disk-resident and tape-resident data,
. Designing and implementing a high-level, dynamic, easy-to-use I/O library that can be invoked by the dynamic compiler/linker,
. Investigating what types of user-specified hints can be passed to the runtime system/compiler, and
how they can be incorporated to reduce the overheads associated with dynamic compilation,
. Evaluating the performance of the developed dynamic compilation/linking infrastructure under
realistic I/O-intensive workloads and quantifying the runtime overheads associated with dynamic
. Providing the developed infrastructure and experimental findings in the public domain, and
incorporating the research findings into the undergraduate and graduate curriculum.
|Effective start/end date||8/1/04 → 7/31/08|
- National Science Foundation: $306,000.00