TY - GEN
T1 - File systems unfit as distributed storage backends: Lessons from 10 years of Ceph evolution
T2 - 27th ACM Symposium on Operating Systems Principles, SOSP 2019
AU - Aghayev, Abutalib
AU - Weil, Sage
AU - Kuchnik, Michael
AU - Nelson, Mark
AU - Ganger, Gregory R.
AU - Amvrosiadis, George
N1 - Funding Information:
Abutalib Aghayev is supported by an SOSP 2019 student scholarship from the National Science Foundation. Michael Kuchnik is supported by an SOSP 2019 student scholarship from the ACM Special Interest Group in Operating Systems and by an NDSEG Fellowship.
Publisher Copyright:
© 2019 Copyright held by the owner/author(s).
PY - 2019/10/27
Y1 - 2019/10/27
N2 - For a decade, the Ceph distributed file system followed the conventional wisdom of building its storage backend on top of local file systems. This is a preferred choice for most distributed file systems today because it allows them to benefit from the convenience and maturity of battle-tested code. Ceph’s experience, however, shows that this comes at a high price. First, developing a zero-overhead transaction mechanism is challenging. Second, metadata performance at the local level can significantly affect performance at the distributed level. Third, supporting emerging storage hardware is painstakingly slow. Ceph addressed these issues with BlueStore, a new backend designed to run directly on raw storage devices. In only two years since its inception, BlueStore outperformed previously established backends and has been adopted by 70% of users in production. By running in user space and fully controlling the I/O stack, it has enabled space-efficient metadata and data checksums, fast overwrites of erasure-coded data, and inline compression; it has also decreased performance variability and avoided a series of performance pitfalls of local file systems. Finally, it makes the adoption of backwards-incompatible storage hardware possible, an important trait in a changing storage landscape that is learning to embrace hardware diversity.
AB - For a decade, the Ceph distributed file system followed the conventional wisdom of building its storage backend on top of local file systems. This is a preferred choice for most distributed file systems today because it allows them to benefit from the convenience and maturity of battle-tested code. Ceph’s experience, however, shows that this comes at a high price. First, developing a zero-overhead transaction mechanism is challenging. Second, metadata performance at the local level can significantly affect performance at the distributed level. Third, supporting emerging storage hardware is painstakingly slow. Ceph addressed these issues with BlueStore, a new backend designed to run directly on raw storage devices. In only two years since its inception, BlueStore outperformed previously established backends and has been adopted by 70% of users in production. By running in user space and fully controlling the I/O stack, it has enabled space-efficient metadata and data checksums, fast overwrites of erasure-coded data, and inline compression; it has also decreased performance variability and avoided a series of performance pitfalls of local file systems. Finally, it makes the adoption of backwards-incompatible storage hardware possible, an important trait in a changing storage landscape that is learning to embrace hardware diversity.
UR - http://www.scopus.com/inward/record.url?scp=85076750752&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85076750752&partnerID=8YFLogxK
U2 - 10.1145/3341301.3359656
DO - 10.1145/3341301.3359656
M3 - Conference contribution
AN - SCOPUS:85076750752
T3 - SOSP 2019 - Proceedings of the 27th ACM Symposium on Operating Systems Principles
SP - 353
EP - 369
BT - SOSP 2019 - Proceedings of the 27th ACM Symposium on Operating Systems Principles
PB - Association for Computing Machinery, Inc
Y2 - 27 October 2019 through 30 October 2019
ER -