Processor-embedded disks, or smart disks, with their network interface controller, can in effect be viewed as processing elements with on-disk memory and secondary storage. The data sizes and access patterns of today's large I/O-intensive workloads require architectures whose processing power scales with increased storage capacity. To address this concern, we propose and evaluate disk-based distributed smart storage architectures. Based on analytically derived performance models, our evaluation with representative workloads show that offloading processing and performing point-to-point data communication improve performance over centralized architectures. Our results also demonstrate that distributed smart disk systems exhibit desirable scalability and can efficiently handle I/O-intensive workloads, such as commercial decision support database (TPC-H) queries, association rules mining, data clustering, and two-dimensional fast Fourier transform, among others.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Hardware and Architecture
- Computer Networks and Communications
- Artificial Intelligence