TY - GEN
T1 - Automatic summary generation for scientific data charts
AU - Al-Zaidy, Rabah A.
AU - Choudhury, Sagnik Ray
AU - Giles, Clyde Lee
N1 - Publisher Copyright:
Copyright © 2016, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2016
Y1 - 2016
N2 - Scientific charts in the web, whether as images or embedded in digital documents, contain valuable information that is not fully available to information retrieval tools. The information used to describe these charts is typically extracted from the image metadata rather than the information the graphic was initially designed to express. The problem of understanding digital charts found in scholarly documents, and inferring useful textual information from their graphical components is the focus of this study. We present an approach to automatically read the chart data, specifically bar charts, and provide the user with a textual summary of the chart. The proposed method follows a knowledge discovery approach that relies on a versatile graph representation of the chart. This representation is derived from analyzing a chart's original data values, from which useful features are extracted. The data features are in turn used to construct a semantic-graph. To generate a summary, the semantic-graph of the chart is mapped to appropriately crafted protoforms, which are constructs based on fuzzy logic. We verify the effectiveness of our framework by conducting experiments on bar charts extracted from over 1,000 PDF documents. Our preliminary results show that, under certain assumptions, 83% of the produced summaries provide plausible descriptions of the bar charts.
AB - Scientific charts in the web, whether as images or embedded in digital documents, contain valuable information that is not fully available to information retrieval tools. The information used to describe these charts is typically extracted from the image metadata rather than the information the graphic was initially designed to express. The problem of understanding digital charts found in scholarly documents, and inferring useful textual information from their graphical components is the focus of this study. We present an approach to automatically read the chart data, specifically bar charts, and provide the user with a textual summary of the chart. The proposed method follows a knowledge discovery approach that relies on a versatile graph representation of the chart. This representation is derived from analyzing a chart's original data values, from which useful features are extracted. The data features are in turn used to construct a semantic-graph. To generate a summary, the semantic-graph of the chart is mapped to appropriately crafted protoforms, which are constructs based on fuzzy logic. We verify the effectiveness of our framework by conducting experiments on bar charts extracted from over 1,000 PDF documents. Our preliminary results show that, under certain assumptions, 83% of the produced summaries provide plausible descriptions of the bar charts.
UR - http://www.scopus.com/inward/record.url?scp=84974575964&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84974575964&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84974575964
T3 - AAAI Workshop - Technical Report
SP - 658
EP - 663
BT - WS-16-01
PB - AI Access Foundation
T2 - 30th AAAI Conference on Artificial Intelligence, AAAI 2016
Y2 - 12 February 2016 through 17 February 2016
ER -