TY - GEN
T1 - WorkloadCompactor: Reducing datacenter cost while providing tail latency SLO guarantees
T2 - 2017 Symposium on Cloud Computing, SoCC 2017
AU - Zhu, Timothy
AU - Kozuch, Michael A.
AU - Harchol-Balter, Mor
N1 - Funding Information:
This research is supported in part by Intel as part of the Intel Science and Technology Center for Cloud Computing (ISTC-CC), by a Google Faculty Research Award 2015/16, by a Facebook Faculty Research Award 2015/16, and by the National Science Foundation under awards CMMI-1538204, CMMI-1334194, CSR-1116282, and XPS-1629444. We also thank the member companies of the PDL Consortium for their interest, insights, feedback, and support.
Publisher Copyright:
© 2017 Association for Computing Machinery.
PY - 2017/9/24
Y1 - 2017/9/24
N2 - Service providers want to reduce datacenter costs by consolidating workloads onto fewer servers. At the same time, customers have performance goals, such as meeting tail latency Service Level Objectives (SLOs). Consolidating workloads while meeting tail latency goals is challenging, especially since workloads in production environments are often bursty. To limit the congestion when consolidating workloads, customers and service providers often agree upon rate limits. Ideally, rate limits are chosen to maximize the number of workloads that can be co-located while meeting each workload's SLO. In reality, neither the service provider nor customer knows how to choose rate limits. Customers end up selecting rate limits on their own in some ad hoc fashion, and service providers are left to optimize given the chosen rate limits. This paper describes WorkloadCompactor, a new system that uses workload traces to automatically choose rate limits simultaneously with selecting onto which server to place workloads. Our system meets customer tail latency SLOs while minimizing datacenter resource costs. Our experiments show that by optimizing the choice of rate limits, WorkloadCompactor reduces the number of required servers by 30-60% as compared to state-of-the-art approaches.
UR - http://www.scopus.com/inward/record.url?scp=85032442336&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85032442336&partnerID=8YFLogxK
U2 - 10.1145/3127479.3132245
DO - 10.1145/3127479.3132245
M3 - Conference contribution
AN - SCOPUS:85032442336
T3 - SoCC 2017 - Proceedings of the 2017 Symposium on Cloud Computing
SP - 598
EP - 610
BT - SoCC 2017 - Proceedings of the 2017 Symposium on Cloud Computing
PB - Association for Computing Machinery, Inc
Y2 - 24 September 2017 through 27 September 2017
ER -