Abstract
Query processing on large-scale scientific datasets often suffers from performance bottlenecks caused by large data transfers between storage nodes and applications in decoupled distributed storage environments. This issue is particularly pronounced for high-selectivity queries, where unnecessary data is transferred from the storage plane to the compute plane. To tackle this challenge, we integrate SmartSSDs, functioning as Computational Storage Devices (CSDs), into the storage layer. By offloading simple filter-projection operations to these CSDs, we significantly reduce data transfer bottlenecks, leading to lower query latency and higher throughput. Our novel framework, CORD (parallelizing query processing across multiple Computational stORage Devices), facilitates parallel query execution across multiple CSDs while considering data locality. CORD is compatible with any decoupled storage system equipped with CSDs. Our extensive empirical evaluation demonstrates that CORD achieves up to 93× speedup for high-selectivity queries compared to a traditional (compute-plane) execution strategy and offers a further 1.64× speedup in cases of uneven data distribution. Additionally, we present two optimizations for batch query processing. Results from our experiments with 4 CSDs reveal substantial performance improvements provided by the optimizations embedded in CORD.
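The idea of offloading filter-projection work can be illustrated with a small sketch. This is not CORD's implementation or API: the in-memory partitions, the row layout `(id, temperature, pressure)`, and the names `filter_project`, `compute_plane_query`, and `offloaded_query` are all assumptions made for illustration, and threads merely stand in for four devices working on their local data. The sketch only compares how many rows would cross from the storage plane to the compute plane under each strategy.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for data resident on 4 CSDs (not CORD's real layout).
# Each row is (id, temperature, pressure).
def make_partition(device_id, n_rows):
    base = device_id * n_rows
    return [(base + i, float((base + i) % 100), 1.0) for i in range(n_rows)]

PARTITIONS = {dev: make_partition(dev, 1000) for dev in range(4)}

def filter_project(rows, predicate, columns):
    """Keep rows matching the predicate, then keep only the chosen columns."""
    return [tuple(row[c] for c in columns) for row in rows if predicate(row)]

def compute_plane_query(predicate, columns):
    """Baseline: every row is shipped to the compute plane and filtered there."""
    transferred = [row for rows in PARTITIONS.values() for row in rows]
    return filter_project(transferred, predicate, columns), len(transferred)

def offloaded_query(predicate, columns):
    """Offload sketch: each device filters its local partition in parallel,
    so only matching, projected rows are sent to the compute plane."""
    with ThreadPoolExecutor(max_workers=len(PARTITIONS)) as pool:
        parts = list(pool.map(
            lambda rows: filter_project(rows, predicate, columns),
            PARTITIONS.values(),
        ))
    result = [row for part in parts for row in part]
    return result, len(result)

def hot(row):
    # Selective predicate: only rows with temperature >= 99 qualify.
    return row[1] >= 99.0

baseline_rows, baseline_moved = compute_plane_query(hot, (0, 1))
offload_rows, offload_moved = offloaded_query(hot, (0, 1))
print(f"rows moved: baseline={baseline_moved}, offloaded={offload_moved}")
```

Both strategies return the same answer; the difference in this toy setting is only the number of rows that would have to leave the storage layer.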
| Original language | English (US) |
|---|---|
| Title of host publication | Proceedings - 2025 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2025 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 1141-1153 |
| Number of pages | 13 |
| Edition | 2025 |
| ISBN (Electronic) | 9798331532376 |
| DOIs | |
| State | Published - 2025 |
| Event | 39th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2025 - Milan, Italy; Duration: Jun 3 2025 → Jun 7 2025 |
Conference
| Conference | 39th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2025 |
|---|---|
| Country/Territory | Italy |
| City | Milan |
| Period | 6/3/25 → 6/7/25 |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Computer Networks and Communications
- Computer Science Applications
- Hardware and Architecture