Adaptive removal of background and white space from document images using seam categorization

Claude Fillion, Zhigang Fan, Vishal Monga

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Document images are obtained regularly by rasterization of document content and as scans of printed documents. Resizing via background and white space removal is often desired for better consumption of these images, whether on displays or in print. While white space and background are easy to identify in images, existing methods such as naïve removal and content-aware resizing (seam carving) each have limitations that can lead to undesirable artifacts, such as uneven spacing between lines of text or poor arrangement of content. An adaptive method based on image content is hence needed. In this paper, we propose an adaptive method to intelligently remove white space and background content from document images. Document images differ from pictorial images in structure. They typically contain objects (text letters, pictures, and graphics) separated by uniform background, which includes both white paper space and other uniformly colored background. Pixels in uniform background regions are excellent candidates for deletion if resizing is required, as deleting them changes document content and style less than deleting object pixels does. We propose a background deletion method that exploits both local and global context. The method aims to retain the document's structural information and image quality.
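The abstract does not spell out the proposed algorithm, so the following is only a minimal sketch of the naïve baseline it mentions, not the authors' adaptive method. It treats a row whose pixels are all (nearly) the same value as uniform background and shrinks each run of such rows. The function names and the `keep` and `tolerance` parameters are assumptions made for illustration.

```python
def is_uniform(line, tolerance=0):
    """Return True if every pixel in `line` is within `tolerance` of the first."""
    if not line:
        return True
    first = line[0]
    return all(abs(p - first) <= tolerance for p in line)


def remove_uniform_rows(image, keep=1, tolerance=0):
    """Shrink each run of consecutive uniform rows to at most `keep` rows.

    `image` is a list of rows of grayscale values (hypothetical layout).
    Adjacent uniform rows count as one run even if their colors differ.
    """
    result = []
    run = 0  # length of the current run of uniform rows
    for row in image:
        if is_uniform(row, tolerance):
            run += 1
            if run > keep:
                continue  # drop the surplus background row
        else:
            run = 0  # a content row ends the run
        result.append(row)
    return result


# Tiny synthetic "page": 255 is white background, 0 is ink.
page = [
    [255, 255, 255, 255],
    [255, 255, 255, 255],
    [255, 255, 255, 255],
    [0, 255, 0, 255],
    [255, 255, 255, 255],
    [255, 255, 255, 255],
    [0, 0, 255, 255],
]
shrunk = remove_uniform_rows(page, keep=1)
```

Because every gap collapses to the same height regardless of its original size, a rule like this can flatten the difference between line spacing and paragraph spacing, which is the kind of layout loss the abstract says an adaptive method should avoid.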

Original language: English (US)
Title of host publication: Imaging and Printing in a Web 2.0 World II
State: Published - 2011
Event: Imaging and Printing in a Web 2.0 World II - San Francisco, CA, United States
Duration: Jan 26 2011 → Jan 27 2011

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 7879
ISSN (Print): 0277-786X

Other

Other: Imaging and Printing in a Web 2.0 World II
Country/Territory: United States
City: San Francisco, CA
Period: 1/26/11 → 1/27/11

All Science Journal Classification (ASJC) codes

  • Electronic, Optical and Magnetic Materials
  • Condensed Matter Physics
  • Computer Science Applications
  • Applied Mathematics
  • Electrical and Electronic Engineering
