Intelligent File System Interfaces for Computational Storage Devices

  • Kandemir, Mahmut M. (PI)
  • Aghayev, Abutalib (CoPI)

Project: Research project

Project Details


The overarching goal of this project is to enable computational storage in the context of large-scale HPC applications that employ data analytics. To achieve this goal, we expose computational storage devices (CSDs) to software, design novel file system interfaces that can take advantage of CSDs, and implement computational storage-specific optimizations in a publicly available parallel file system to minimize data movement between the host machine and CSDs as well as among CSDs themselves.

While the hardware employed in HPC datacenters has made significant strides over the last few decades, the 'physical separation' of computation (performed in CPUs and GPUs) from data (stored in HDDs and SSDs) is becoming an increasingly performance-limiting factor due to the excessive data movement costs it entails. HPC systems are therefore motivated to embrace new technologies, storage systems, and software frameworks that minimize data movement costs. One promising technology poised to reduce these costs is computational storage: a full-fledged compute platform comprising compute elements, memory, storage, and system software, all placed inside an SSD. Computational storage enables what can be termed Near-Data Computation, i.e., performing computation 'near storage' instead of the conventional approach of bringing data to where computation is scheduled (host CPUs and GPUs).

This project has five complementary thrusts:

  • Thrust-I: Exposing CSDs to the Parallel File System and Application Programs. We expose CSDs to the software stack via suitable hardware abstractions. The rationale is to treat CSDs as 'first-class compute engines,' just like CPUs and GPUs in an HPC datacenter.
  • Thrust-II: Novel File System Interfaces. We explore three types of new interfaces specifically designed for CSDs. The prescriptive interface gives the user explicit control in deciding where (host or CSDs) to execute offloaded functions and where to store the resulting data. In contrast, the descriptive interface lets the file system decide the location of offloaded computations and the resulting data, performing various CSD-specific optimizations automatically. Finally, the predictive interface allows select functions to execute on CSDs speculatively.
  • Thrust-III: Metadata Enrichment. We augment conventional parallel file system metadata with CSD-specific metadata to enable our optimizations. The new metadata include characteristics of the functions offloaded to CSDs as well as performance metadata that track function invocation/data access frequencies and affinity across functions.
  • Thrust-IV: CSD-Specific Optimizations. We investigate CSD-specific file system optimizations whose primary goal is to minimize data movement while ensuring as much parallel execution on CSDs as possible. These optimizations include, but are not limited to, (i) performing data-parallel computation on CSDs; (ii) deciding the ideal CSDs on which to execute a given offloaded function and on which to store the resulting data; (iii) optimized execution of predictive calls; (iv) proactive data migration across CSDs; and (v) scheduling offloaded functions on the limited resources in CSDs.
  • Thrust-V: Implementation, Evaluation, and Benchmarking. We implement our techniques in Lustre and test their effectiveness using a variety of HPC workloads and different types of CSDs. We build a CSD emulator using SPDK to drive the initial stages of our research, and we maintain a public repository that includes the source code for the CSD emulator and our Lustre patches, as well as detailed experimental results.
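To make the distinction between the prescriptive and descriptive interfaces concrete, the following is a minimal sketch of how such an offload call might look to an application. All names (`Placement`, `submit_offload`) and the stubbed routing logic are illustrative assumptions, not the project's actual API; a real implementation would route requests through the parallel file system to the CSDs holding the relevant data.

```python
from enum import Enum

class Placement(Enum):
    """Where an offloaded analytics function may run or store its result."""
    HOST = "host"        # conventional execution on the host CPU
    CSD = "csd"          # run inside a computational storage device
    FS_DECIDES = "fs"    # descriptive: the file system picks the target

def submit_offload(func_name: str, input_path: str,
                   exec_where: Placement,
                   result_where: Placement) -> str:
    """Stub submission: report which side would execute the function."""
    if exec_where is Placement.FS_DECIDES:
        # Descriptive interface: the file system would consult its
        # CSD-specific metadata (invocation frequency, data affinity)
        # to choose a placement and apply optimizations automatically.
        return "fs-chosen"
    # Prescriptive interface: the user's explicit choice is honored.
    return exec_where.value

# Prescriptive call: the user pins execution and result placement to a CSD.
print(submit_offload("histogram", "/mnt/lustre/data.bin",
                     Placement.CSD, Placement.CSD))              # csd
# Descriptive call: placement is delegated to the file system.
print(submit_offload("histogram", "/mnt/lustre/data.bin",
                     Placement.FS_DECIDES, Placement.FS_DECIDES))  # fs-chosen
```

The predictive interface would extend this with a speculative placement mode, executing select functions on CSDs ahead of a confirmed need.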
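The enriched metadata of Thrust-III could be pictured as a small per-function record. The structure below is a hypothetical sketch (the class and field names are assumptions, not the project's design) of how tracking invocation and data-access frequencies would support decisions such as proactive data migration across CSDs:

```python
from dataclasses import dataclass, field
from collections import Counter

@dataclass
class CSDFunctionMetadata:
    """Hypothetical per-function record a parallel file system might keep
    alongside its conventional metadata."""
    func_name: str
    invocations: int = 0
    data_accesses: Counter = field(default_factory=Counter)  # path -> count

    def record_call(self, input_path: str) -> None:
        """Update frequency counters on each offloaded invocation."""
        self.invocations += 1
        self.data_accesses[input_path] += 1

    def hottest_input(self):
        """The input file this function touches most often -- a natural
        candidate for proactive migration onto the CSD that runs it."""
        if not self.data_accesses:
            return None
        return self.data_accesses.most_common(1)[0][0]

md = CSDFunctionMetadata("histogram")
md.record_call("/mnt/lustre/a.bin")
md.record_call("/mnt/lustre/a.bin")
md.record_call("/mnt/lustre/b.bin")
print(md.invocations)      # 3
print(md.hottest_input())  # /mnt/lustre/a.bin
```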
Our project is expected to (i) reduce the 'distance-to-data' for analytics functions/kernels employed in the HPC domain, and (ii) relieve host CPUs and GPUs of the task of executing such functions/kernels, freeing these resources for simulation alone. As a result, application performance is expected to improve, and the energy footprint of HPC datacenters to shrink.
Effective start/end date: 9/1/22 to 8/31/25


  • Advanced Scientific Computing Research: $891,217.00

