Utility-based control feedback in a digital library search engine: Cases in CiteSeerX

Jian Wu, Alexander Ororbia, Kyle Williams, Madian Khabsa, Zhaohui Wu, C. Lee Giles

Research output: Contribution to conference › Paper › peer-review

1 Scopus citation

Abstract

We describe a utility-based feedback control model and its applications within an open-access digital library search engine - CiteSeerX, the new version of CiteSeer. CiteSeerX leverages user-based feedback to correct metadata and reformulate the citation graph. New documents are automatically crawled using a focused crawler for indexing. Those documents that are ingested have their document URLs automatically inspected so as to provide feedback to a whitelist filter, which automatically selects high-quality crawl seed URLs. The changing citation count plus the download history of papers is an indicator of ill-conditioned metadata that needs correction. We believe that these feedback mechanisms effectively improve the overall metadata quality and save computational resources. Although these mechanisms are used in the context of CiteSeerX, we believe they can be readily transferred to other similar systems.

Original language: English (US)
State: Published - 2014
Event: 9th International Workshop on Feedback Computing - Philadelphia, United States
Duration: Jun 17 2014 - Jun 20 2014

Conference

Conference: 9th International Workshop on Feedback Computing
Country/Territory: United States
City: Philadelphia
Period: 6/17/14 - 6/20/14

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computer Science Applications
  • Software
  • Artificial Intelligence
  • Modeling and Simulation
