TY - GEN
T1 - Discretionary caching for I/O on clusters
AU - Vilayannur, Murali
AU - Sivasubramaniam, Anand
AU - Kandemir, Mahmut
AU - Thakur, Rajeev
AU - Ross, Robert
PY - 2003
Y1 - 2003
N2 - I/O bottlenecks are already a problem in many large-scale applications that manipulate huge datasets. This problem is expected to get worse as applications get larger and I/O subsystem performance lags behind processor and memory speed improvements. Caching I/O blocks is one effective way of alleviating disk latencies, and there can be multiple levels of caching on a cluster of workstations. Previous studies have shown the benefits of caching - whether local to a particular node or shared globally across the cluster - for certain applications. However, we show that while caching is useful in some situations, it can hurt performance if we are not careful about what to cache and when to bypass the cache. This paper presents compilation techniques and runtime support to address this problem. These techniques are implemented and evaluated on an experimental Linux/Pentium cluster running a parallel file system. Our results using a diverse set of applications (scientific and commercial) demonstrate the benefits of a discretionary approach to caching for I/O subsystems on clusters, providing as much as 33% savings over indiscriminately caching everything in some applications.
AB - I/O bottlenecks are already a problem in many large-scale applications that manipulate huge datasets. This problem is expected to get worse as applications get larger and I/O subsystem performance lags behind processor and memory speed improvements. Caching I/O blocks is one effective way of alleviating disk latencies, and there can be multiple levels of caching on a cluster of workstations. Previous studies have shown the benefits of caching - whether local to a particular node or shared globally across the cluster - for certain applications. However, we show that while caching is useful in some situations, it can hurt performance if we are not careful about what to cache and when to bypass the cache. This paper presents compilation techniques and runtime support to address this problem. These techniques are implemented and evaluated on an experimental Linux/Pentium cluster running a parallel file system. Our results using a diverse set of applications (scientific and commercial) demonstrate the benefits of a discretionary approach to caching for I/O subsystems on clusters, providing as much as 33% savings over indiscriminately caching everything in some applications.
UR - http://www.scopus.com/inward/record.url?scp=46449127520&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=46449127520&partnerID=8YFLogxK
U2 - 10.1109/CCGRID.2003.1199357
DO - 10.1109/CCGRID.2003.1199357
M3 - Conference contribution
AN - SCOPUS:46449127520
SN - 0769519199
SN - 9780769519197
T3 - Proceedings - CCGrid 2003: 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid
SP - 96
EP - 103
BT - Proceedings - CCGrid 2003
T2 - 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, CCGrid 2003
Y2 - 12 May 2003 through 15 May 2003
ER -