A sparse gaussian processes classification framework for fast tag suggestions

Yang Song, Lu Zhang, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceedingConference contribution

34 Scopus citations

Abstract

Tagged data is rapidly becoming more available on theWorld Wide Web. Web sites which populate tagging services offer a good way for Internet users to share their knowledge. An interesting problem is how to make tag suggestions when a new resource becomes available. In this paper, we address the issue of efficient tag suggestion. We first propose a multi-class sparse Gaussian process classification framework (SGPS) which is capable of classifying data with very few training instances. We suggest a novel prototype selection algorithm to select the best subset of points for model learning. The framework is then extended to a novel multi-class multi-label classification algorithm (MMSG) that transforms tag suggestion into the problem of multi-label ranking. Experiments on bench-mark data sets and real-world data from Del.icio.us and BibSonomy suggest that our model can greatly improve the performance of tag suggestions when compared to the state-of-the-art. Overall, our model requires linear time to train and constant time to predict per case. The memory consumption is also significantly less than traditional batch learning algorithms such as SVMs. In addition, results on tagging digital data also demonstrate that our model is capable of recommending relevant tags to images and videos by using their surrounding textual information.

Original languageEnglish (US)
Title of host publicationProceedings of the 17th ACM Conference on Information and Knowledge Management, CIKM'08
Pages93-102
Number of pages10
DOIs
StatePublished - 2008
Event17th ACM Conference on Information and Knowledge Management, CIKM'08 - Napa Valley, CA, United States
Duration: Oct 26 2008Oct 30 2008

Publication series

NameInternational Conference on Information and Knowledge Management, Proceedings

Other

Other17th ACM Conference on Information and Knowledge Management, CIKM'08
Country/TerritoryUnited States
CityNapa Valley, CA
Period10/26/0810/30/08

All Science Journal Classification (ASJC) codes

  • General Business, Management and Accounting
  • General Decision Sciences

Fingerprint

Dive into the research topics of 'A sparse gaussian processes classification framework for fast tag suggestions'. Together they form a unique fingerprint.

Cite this