TY - GEN
T1 - Rethinking software runtimes for disaggregated memory
AU - Calciu, Irina
AU - Imran, M. Talha
AU - Puddu, Ivan
AU - Kashyap, Sanidhya
AU - Maruf, Hasan Al
AU - Mutlu, Onur
AU - Kolli, Aasheesh
N1 - Publisher Copyright:
© 2021 ACM.
PY - 2021/4/19
Y1 - 2021/4/19
N2 - Disaggregated memory can address resource provisioning inefficiencies in current datacenters. Multiple software runtimes for disaggregated memory have been proposed in an attempt to make disaggregated memory practical. These systems rely on the virtual memory subsystem to transparently offer disaggregated memory to applications using a local memory abstraction. Unfortunately, using virtual memory for disaggregation has multiple limitations, including high overhead that comes from the use of page faults to identify what data to fetch and cache locally, and high dirty data amplification that comes from the use of page-granularity for tracking changes to the cached data (4KB or higher). In this paper, we propose a fundamentally new approach to designing software runtimes for disaggregated memory that addresses these limitations. Our main observation is that we can use cache coherence instead of virtual memory for tracking applications' memory accesses transparently, at cache-line granularity. This simple idea (1) eliminates page faults from the application critical path when accessing remote data, and (2) decouples the application memory access tracking from the virtual memory page size, enabling cache-line granularity dirty data tracking and eviction. Using this observation, we implemented a new software runtime for disaggregated memory that improves average memory access time by 1.7-5X and reduces dirty data amplification by 2-10X, compared to state-of-the-art systems.
AB - Disaggregated memory can address resource provisioning inefficiencies in current datacenters. Multiple software runtimes for disaggregated memory have been proposed in an attempt to make disaggregated memory practical. These systems rely on the virtual memory subsystem to transparently offer disaggregated memory to applications using a local memory abstraction. Unfortunately, using virtual memory for disaggregation has multiple limitations, including high overhead that comes from the use of page faults to identify what data to fetch and cache locally, and high dirty data amplification that comes from the use of page-granularity for tracking changes to the cached data (4KB or higher). In this paper, we propose a fundamentally new approach to designing software runtimes for disaggregated memory that addresses these limitations. Our main observation is that we can use cache coherence instead of virtual memory for tracking applications' memory accesses transparently, at cache-line granularity. This simple idea (1) eliminates page faults from the application critical path when accessing remote data, and (2) decouples the application memory access tracking from the virtual memory page size, enabling cache-line granularity dirty data tracking and eviction. Using this observation, we implemented a new software runtime for disaggregated memory that improves average memory access time by 1.7-5X and reduces dirty data amplification by 2-10X, compared to state-of-the-art systems.
UR - http://www.scopus.com/inward/record.url?scp=85104772812&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85104772812&partnerID=8YFLogxK
U2 - 10.1145/3445814.3446713
DO - 10.1145/3445814.3446713
M3 - Conference contribution
AN - SCOPUS:85104772812
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 79
EP - 92
BT - Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2021
PB - Association for Computing Machinery
T2 - 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2021
Y2 - 19 April 2021 through 23 April 2021
ER -