Deriving knowledge from figures for digital libraries

Xiaonan Lu, James Z. Wang, Prasenjit Mitra, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Scopus citations

Abstract

Figures in digital documents contain important information, yet current digital libraries do not summarize or index the information available within figures for document retrieval. We present a system for automatic categorization of figures and extraction of data from 2-D plots. A machine-learning-based method categorizes figures into a set of predefined types based on image features. An automated algorithm extracts data values from solid line curves in 2-D plots. The semantic type of each figure and the data values extracted from 2-D plots can be integrated with textual information within documents to provide more effective document retrieval services for digital library users. Experimental evaluation has demonstrated that our system can produce results suitable for real-world use.

Original language: English (US)
Title of host publication: 16th International World Wide Web Conference, WWW2007
Pages: 1229-1230
Number of pages: 2
State: Published - 2007
Event: 16th International World Wide Web Conference, WWW2007 - Banff, AB, Canada
Duration: May 8 2007 → May 12 2007

Publication series

Name: 16th International World Wide Web Conference, WWW2007

Other

Other: 16th International World Wide Web Conference, WWW2007
Country/Territory: Canada
City: Banff, AB
Period: 5/8/07 → 5/12/07

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Software
