TY - GEN
T1 - StoRM
T2 - 12th ACM International Systems and Storage Conference, SYSTOR 2019
AU - Novakovic, Stanko
AU - Shan, Yizhou
AU - Kolli, Aasheesh
AU - Cui, Michael
AU - Zhang, Yiying
AU - Eran, Haggai
AU - Pismenny, Boris
AU - Liss, Liran
AU - Wei, Michael
AU - Tsafrir, Dan
AU - Aguilera, Marcos
N1 - Publisher Copyright: © 2019 Copyright held by the owner/author(s).
PY - 2019/5/22
Y1 - 2019/5/22
AB - RDMA technology enables a host to access the memory of a remote host without involving the remote CPU, improving the performance of distributed in-memory storage systems. Previous studies argued that RDMA suffers from scalability issues, because the NIC’s limited resources are unable to simultaneously cache the state of all the concurrent network streams. These concerns led to various software-based proposals to reduce the size of this state by trading off performance. We revisit these proposals and show that they no longer apply when using newer RDMA NICs in rack-scale environments. In particular, we find that one-sided remote memory primitives lead to better performance as compared to the previously proposed unreliable datagram and kernel-based stacks. Based on this observation, we design and implement Storm, a transactional dataplane utilizing one-sided read and write-based RPC primitives. We show that Storm outperforms eRPC, FaRM, and LITE by 3.3x, 3.6x, and 17.1x, respectively, on an InfiniBand cluster with Mellanox ConnectX-4 NICs.
UR - http://www.scopus.com/inward/record.url?scp=85067102417&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85067102417&partnerID=8YFLogxK
U2 - 10.1145/3319647.3325827
DO - 10.1145/3319647.3325827
M3 - Conference contribution
AN - SCOPUS:85067102417
T3 - SYSTOR 2019 - Proceedings of the 12th ACM International Systems and Storage Conference
SP - 97
EP - 108
BT - SYSTOR 2019 - Proceedings of the 12th ACM International Systems and Storage Conference
PB - Association for Computing Machinery, Inc
Y2 - 3 June 2019 through 5 June 2019
ER -