TY - GEN
T1 - RobinHood
T2 - 13th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2018
AU - Berger, Daniel S.
AU - Berg, Benjamin
AU - Zhu, Timothy
AU - Harchol-Balter, Mor
AU - Sen, Siddhartha
N1 - Funding Information:
We thank Jen Guriel, Bhavesh Thaker, Omprakash Maity, and everyone on the OneRF team at Microsoft. We also thank the anonymous reviewers, and our shepherd, Frans Kaashoek, for their feedback. This paper was supported by NSF-CSR-180341, NSF-XPS-1629444, NSF-CMMI-1538204, and a Faculty Award from Microsoft.
Publisher Copyright:
© Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2018. All rights reserved.
PY - 2018
Y1 - 2018
N2 - Tail latency is of great importance in user-facing web services. However, maintaining low tail latency is challenging, because a single request to a web application server results in multiple queries to complex, diverse backend services (databases, recommender systems, ad systems, etc.). A request is not complete until all of its queries have completed. We analyze a Microsoft production system and find that backend query latencies vary by more than two orders of magnitude across backends and over time, resulting in high request tail latencies. We propose a novel solution for maintaining low request tail latency: repurpose existing caches to mitigate the effects of backend latency variability, rather than just caching popular data. Our solution, RobinHood, dynamically reallocates cache resources from the cache-rich (backends which don't affect request tail latency) to the cache-poor (backends which affect request tail latency). We evaluate RobinHood with production traces on a 50-server cluster with 20 different backend systems. Surprisingly, we find that RobinHood can directly address tail latency even if working sets are much larger than the cache size. In the presence of load spikes, RobinHood meets a 150ms P99 goal 99.7% of the time, whereas the next best policy meets this goal only 70% of the time.
UR - http://www.scopus.com/inward/record.url?scp=85071135180&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85071135180&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85071135180
T3 - Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2018
SP - 195
EP - 212
BT - Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2018
PB - USENIX Association
Y2 - 8 October 2018 through 10 October 2018
ER -