TY - GEN
T1 - Exploring Social Annotations for Information Retrieval
AU - Zhou, Ding
AU - Bian, Jiang
AU - Zheng, Shuyi
AU - Zha, Hongyuan
AU - Giles, C. Lee
PY - 2008
Y1 - 2008
N2 - Social annotation has gained increasing popularity in many Web-based applications, leading to an emerging research area in text analysis and information retrieval. This paper is concerned with developing probabilistic models and computational algorithms for social annotations. We propose a unified framework to combine the modeling of social annotations with the language modeling-based methods for information retrieval. The proposed approach consists of two steps: (1) discovering topics in the contents and annotations of documents while categorizing the users by domains; and (2) enhancing document and query language models by incorporating user domain interests as well as topical background models. In particular, we propose a new general generative model for social annotations, which is then simplified to a computationally tractable hierarchical Bayesian network. Then we apply smoothing techniques in a risk minimization framework to incorporate the topical information into language models. Experiments are carried out on a real-world annotation data set sampled from del.icio.us. Our results demonstrate significant improvements over traditional approaches.
UR - http://www.scopus.com/inward/record.url?scp=57349185892&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=57349185892&partnerID=8YFLogxK
U2 - 10.1145/1367497.1367594
DO - 10.1145/1367497.1367594
M3 - Conference contribution
AN - SCOPUS:57349185892
SN - 9781605580852
T3 - Proceeding of the 17th International Conference on World Wide Web 2008, WWW'08
SP - 715
EP - 724
BT - Proceeding of the 17th International Conference on World Wide Web 2008, WWW'08
T2 - 17th International Conference on World Wide Web 2008, WWW'08
Y2 - 21 April 2008 through 25 April 2008
ER -