TY - GEN
T1 - Predicting web cache behavior using stochastic state-space models
AU - Das, Amitayu
AU - Datta, Ritendra
AU - Urgaonkar, Bhuvan
AU - Sivasubramaniam, Anand
PY - 2008
Y1 - 2008
N2 - Accurate analytical models of Web caches are desirable as they can provide inexpensive ways to make resource provisioning decisions at a cache itself as well as at the Web servers it is servicing. Explicitly modeling a Web cache has two major shortcomings: (i) it requires several simplifying assumptions about the operation of the cache for mathematical tractability, resulting in loss of accuracy, and (ii) it requires measurements of phenomena internal to the cache that may not always be available without adding monitoring hooks within the cache. Therefore, in this paper, we turn towards statistical techniques to develop a model that is non-intrusive (that is, requires no additions to the cache) and treats the Web cache as a black box (that is, operates solely by observing readily available inputs/outputs and requires no knowledge about the internals of the cache). Relying on the intuition that the internal dynamics of a cache can be captured by a first-order time-dependent process, we develop a model called SMCP, based on the well-studied linear Gaussian state space model, to observe, characterize, and predict the hit rates at a Web cache. A comparison with time-independent models, including one based on Linear Regression (LR), validates our intuition for the need to employ a time-dependent model. A detailed evaluation shows the efficacy of our model with LRU and LFU, two representative cache replacement policies. In our experiments, SMCP predicts hit ratios within 0.1 (absolute value) of their actual values 77.5% and 65% of the time for LRU and LFU, respectively. Moreover, SMCP captures the time-varying behavior more accurately than several time-independent models.
AB - Accurate analytical models of Web caches are desirable as they can provide inexpensive ways to make resource provisioning decisions at a cache itself as well as at the Web servers it is servicing. Explicitly modeling a Web cache has two major shortcomings: (i) it requires several simplifying assumptions about the operation of the cache for mathematical tractability, resulting in loss of accuracy, and (ii) it requires measurements of phenomena internal to the cache that may not always be available without adding monitoring hooks within the cache. Therefore, in this paper, we turn towards statistical techniques to develop a model that is non-intrusive (that is, requires no additions to the cache) and treats the Web cache as a black box (that is, operates solely by observing readily available inputs/outputs and requires no knowledge about the internals of the cache). Relying on the intuition that the internal dynamics of a cache can be captured by a first-order time-dependent process, we develop a model called SMCP, based on the well-studied linear Gaussian state space model, to observe, characterize, and predict the hit rates at a Web cache. A comparison with time-independent models, including one based on Linear Regression (LR), validates our intuition for the need to employ a time-dependent model. A detailed evaluation shows the efficacy of our model with LRU and LFU, two representative cache replacement policies. In our experiments, SMCP predicts hit ratios within 0.1 (absolute value) of their actual values 77.5% and 65% of the time for LRU and LFU, respectively. Moreover, SMCP captures the time-varying behavior more accurately than several time-independent models.
UR - http://www.scopus.com/inward/record.url?scp=62749182877&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=62749182877&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:62749182877
SN - 1601320841
SN - 9781601320841
T3 - Proceedings of the 2008 International Conference on Parallel and Distributed Processing Techniques and Applications, PDPTA 2008
SP - 609
EP - 616
BT - Proceedings of the 2008 International Conference on Parallel and Distributed Processing Techniques and Applications, PDPTA 2008
T2 - 2008 International Conference on Parallel and Distributed Processing Techniques and Applications, PDPTA 2008
Y2 - 14 July 2008 through 17 July 2008
ER -