TY - GEN
T1 - TraceSplitter
T2 - 16th European Conference on Computer Systems, EuroSys 2021
AU - Sajal, Sultan Mahmud
AU - Hasan, Rubaba
AU - Zhu, Timothy
AU - Urgaonkar, Bhuvan
AU - Sen, Siddhartha
N1 - Publisher Copyright: © 2021 ACM.
PY - 2021/4/21
Y1 - 2021/4/21
N2 - Realistic experimentation is a key component of systems research and industry prototyping, but experimental clusters are often too small to replay the high traffic rates found in production traces. Thus, it is often necessary to downscale traces to lower their arrival rate, and researchers/practitioners generally do this in an ad-hoc manner. For example, one practice is to multiply all arrival timestamps in a trace by a scaling factor to spread the load across a longer timespan. However, temporal patterns are skewed by this approach, which may lead to inappropriate conclusions about some system properties (e.g., the agility of auto-scaling). Another popular approach is to count the number of arrivals in fixed-sized time intervals and scale it according to some modeling assumptions. However, such approaches can eliminate or exaggerate the fine-grained burstiness in the trace depending on the time interval length. The goal of this paper is to demonstrate the drawbacks of common downscaling techniques and propose new methods for realistically downscaling traces. We introduce a new paradigm for scaling traces that splits an original trace into multiple downscaled traces to accurately capture the characteristics of the original trace. Our key insight is that production traces are often generated by a cluster of service instances sitting behind a load balancer; by mimicking the load balancing used to split load across these instances, we can similarly split the production trace in a manner that captures the workload experienced by each service instance. Using production traces, synthetic traces, and a case study of an auto-scaling system, we identify and evaluate a variety of scenarios that show how our approach is superior to current approaches.
UR - http://www.scopus.com/inward/record.url?scp=85105355153&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85105355153&partnerID=8YFLogxK
U2 - 10.1145/3447786.3456262
DO - 10.1145/3447786.3456262
M3 - Conference contribution
AN - SCOPUS:85105355153
T3 - EuroSys 2021 - Proceedings of the 16th European Conference on Computer Systems
SP - 606
EP - 619
BT - EuroSys 2021 - Proceedings of the 16th European Conference on Computer Systems
PB - Association for Computing Machinery, Inc
Y2 - 26 April 2021 through 28 April 2021
ER -