Searching for tables in digital documents

Ying Liu, Kun Bai, Prasenjit Mitra, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

6 Scopus citations

Abstract

Tables are ubiquitous. In scientific documents, tables are widely used to present experimental results or statistical data in a condensed fashion. Current search engines do not allow end users to search for relevant tables. In this paper, we describe TableSeer, an automatic table extraction and search engine system. TableSeer crawls scientific documents, identifies documents that contain tables, extracts the tables, indexes them, and enables end users to search for them. We also propose an extensive set of medium-independent metadata for table representation. Given a query, TableSeer ranks the returned results using an innovative ranking algorithm, TableRank. Our results show that TableSeer outperforms popular search engines, such as Google Scholar, when the end user searches for tables.

Original language: English (US)
Title of host publication: Proceedings - 9th International Conference on Document Analysis and Recognition, ICDAR 2007
Pages: 934-938
Number of pages: 5
State: Published - 2007
Event: 9th International Conference on Document Analysis and Recognition, ICDAR 2007 - Curitiba, Brazil
Duration: Sep 23 2007 - Sep 26 2007

Publication series

Name: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
Volume: 2
ISSN (Print): 1520-5363

Other

Other: 9th International Conference on Document Analysis and Recognition, ICDAR 2007
Country/Territory: Brazil
City: Curitiba
Period: 9/23/07 - 9/26/07

All Science Journal Classification (ASJC) codes

  • Computer Vision and Pattern Recognition
