TY - JOUR
T1 - Workload analysis for scientific literature digital libraries
AU - Li, Huajing
AU - Lee, Wang Chien
AU - Sivasubramaniam, Anand
AU - Giles, C. Lee
PY - 2008/11/1
Y1 - 2008/11/1
N2 - Workload studies of large-scale systems can help locate possible bottlenecks and improve performance. However, previous workload analysis for Web applications has typically focused on generic platforms, neglecting the unique characteristics exhibited in the various domains of these applications. Different application domains have intrinsically heterogeneous characteristics, which have a direct impact on system performance. In this study, we present an extensive analysis of the workload of scientific literature digital libraries, unveiling their temporal and user interest patterns. Logs of a computer science literature digital library, CiteSeer, are collected and analyzed. We intentionally remove service details specific to CiteSeer, and we believe our analysis is applicable to other systems with similar characteristics. While many of our findings are consistent with previous Web analysis, we discover several unique characteristics of the scientific literature digital library workload. Furthermore, we discuss how to use our findings to improve system performance.
AB - Workload studies of large-scale systems can help locate possible bottlenecks and improve performance. However, previous workload analysis for Web applications has typically focused on generic platforms, neglecting the unique characteristics exhibited in the various domains of these applications. Different application domains have intrinsically heterogeneous characteristics, which have a direct impact on system performance. In this study, we present an extensive analysis of the workload of scientific literature digital libraries, unveiling their temporal and user interest patterns. Logs of a computer science literature digital library, CiteSeer, are collected and analyzed. We intentionally remove service details specific to CiteSeer, and we believe our analysis is applicable to other systems with similar characteristics. While many of our findings are consistent with previous Web analysis, we discover several unique characteristics of the scientific literature digital library workload. Furthermore, we discuss how to use our findings to improve system performance.
UR - http://www.scopus.com/inward/record.url?scp=56049091745&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=56049091745&partnerID=8YFLogxK
U2 - 10.1007/s00799-008-0043-z
DO - 10.1007/s00799-008-0043-z
M3 - Article
AN - SCOPUS:56049091745
SN - 1432-5012
VL - 9
SP - 139
EP - 149
JO - International Journal on Digital Libraries
JF - International Journal on Digital Libraries
IS - 2
ER -