Exploring Social Annotations for Information Retrieval

Ding Zhou, Jiang Bian, Shuyi Zheng, Hongyuan Zha, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

147 Scopus citations

Abstract

Social annotation has gained increasing popularity in many Web-based applications, leading to an emerging research area in text analysis and information retrieval. This paper is concerned with developing probabilistic models and computational algorithms for social annotations. We propose a unified framework to combine the modeling of social annotations with the language modeling-based methods for information retrieval. The proposed approach consists of two steps: (1) discovering topics in the contents and annotations of documents while categorizing the users by domains; and (2) enhancing document and query language models by incorporating user domain interests as well as topical background models. In particular, we propose a new general generative model for social annotations, which is then simplified to a computationally tractable hierarchical Bayesian network. Then we apply smoothing techniques in a risk minimization framework to incorporate the topical information into language models. Experiments are carried out on a real-world annotation data set sampled from del.icio.us. Our results demonstrate significant improvements over traditional approaches.
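The second step in the abstract, smoothing a document language model with a topical background, can be illustrated with a minimal sketch. This is not the paper's model: it swaps in plain linear interpolation (Jelinek-Mercer-style) of three unigram models, and it stands in for the learned topics with a simple unigram model built from a document's tags. All function names, interpolation weights, and the toy data are assumptions made for illustration only.

```python
import math
from collections import Counter


def unigram_lm(tokens):
    """Maximum-likelihood unigram model: word -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def smoothed_prob(word, doc_lm, topic_lm, collection_lm,
                  lam_doc=0.6, lam_topic=0.3):
    """Linearly interpolate document, topical, and collection models.

    The weights are illustrative assumptions, not values from the paper.
    """
    lam_coll = 1.0 - lam_doc - lam_topic
    return (lam_doc * doc_lm.get(word, 0.0)
            + lam_topic * topic_lm.get(word, 0.0)
            + lam_coll * collection_lm.get(word, 0.0))


def query_log_likelihood(query, doc_lm, topic_lm, collection_lm):
    """Log-probability of the query under the smoothed document model."""
    score = 0.0
    for word in query:
        p = smoothed_prob(word, doc_lm, topic_lm, collection_lm)
        if p == 0.0:
            return float("-inf")  # word unseen in every model
        score += math.log(p)
    return score


# Hypothetical toy data: document text plus the tags users attached to it.
doc_a = "python web framework tutorial".split()
tags_a = "programming python code web".split()
doc_b = "cooking pasta recipe tutorial".split()
tags_b = "food recipe cooking kitchen".split()

collection_lm = unigram_lm(doc_a + tags_a + doc_b + tags_b)
query = ["code", "tutorial"]

# "code" never occurs in doc_a's text, but its tag-based model covers it.
score_a = query_log_likelihood(
    query, unigram_lm(doc_a), unigram_lm(tags_a), collection_lm)
score_b = query_log_likelihood(
    query, unigram_lm(doc_b), unigram_lm(tags_b), collection_lm)
print(score_a > score_b)
```

In this toy setup the annotation-derived model lets the first document match a query term that its own text lacks, which is the intuition behind adding a topical background to the usual collection-level smoothing.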

Original language: English (US)
Title of host publication: Proceeding of the 17th International Conference on World Wide Web 2008, WWW'08
Pages: 715-724
Number of pages: 10
DOIs
State: Published - 2008
Event: 17th International Conference on World Wide Web 2008, WWW'08 - Beijing, China
Duration: Apr 21 2008 → Apr 25 2008

Publication series

Name: Proceeding of the 17th International Conference on World Wide Web 2008, WWW'08

Other

Other: 17th International Conference on World Wide Web 2008, WWW'08
Country/Territory: China
City: Beijing
Period: 4/21/08 → 4/25/08

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
