TY - GEN
T1 - FemaRepViz
T2 - VAST IEEE Symposium on Visual Analytics Science and Technology 2007
AU - Pan, Chi Chun
AU - Mitra, Prasenjit
PY - 2007
Y1 - 2007
N2 - An architecture for visualizing information extracted from text documents is proposed. In conformance with this architecture, a toolkit, FemaRepViz, has been implemented to extract and visualize temporal, geospatial, and summarized information from FEMA National Situation Updates. Preliminary tests have shown satisfactory accuracy for FemaRepViz. A central component of the architecture is an entity extractor that extracts named entities such as person names, location names, and temporal references. FemaRepViz is based on FactXtractor, an entity extractor that works on text documents. The information extracted using FactXtractor is processed using GeoTagger, a geographical name disambiguation tool based on a novel clustering-based disambiguation algorithm. To extract relationships among entities, we propose a machine-learning-based algorithm that uses a novel stripped dependency tree kernel. We illustrate and evaluate the usefulness of our system on the FEMA National Situation Updates. Daily reports are fetched by FemaRepViz from the FEMA website and segmented into coherent sections, and each section is classified into one of several known incident types. We use ConceptVista, Google Maps, and Google Earth to visualize the events extracted from the text reports and to let the user interactively filter the topics, locations, and time periods of interest. The result is a visual analytics toolkit for rapid analysis of events reported in a large set of text documents.
AB - An architecture for visualizing information extracted from text documents is proposed. In conformance with this architecture, a toolkit, FemaRepViz, has been implemented to extract and visualize temporal, geospatial, and summarized information from FEMA National Situation Updates. Preliminary tests have shown satisfactory accuracy for FemaRepViz. A central component of the architecture is an entity extractor that extracts named entities such as person names, location names, and temporal references. FemaRepViz is based on FactXtractor, an entity extractor that works on text documents. The information extracted using FactXtractor is processed using GeoTagger, a geographical name disambiguation tool based on a novel clustering-based disambiguation algorithm. To extract relationships among entities, we propose a machine-learning-based algorithm that uses a novel stripped dependency tree kernel. We illustrate and evaluate the usefulness of our system on the FEMA National Situation Updates. Daily reports are fetched by FemaRepViz from the FEMA website and segmented into coherent sections, and each section is classified into one of several known incident types. We use ConceptVista, Google Maps, and Google Earth to visualize the events extracted from the text reports and to let the user interactively filter the topics, locations, and time periods of interest. The result is a visual analytics toolkit for rapid analysis of events reported in a large set of text documents.
UR - http://www.scopus.com/inward/record.url?scp=47349093410&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=47349093410&partnerID=8YFLogxK
U2 - 10.1109/VAST.2007.4388991
DO - 10.1109/VAST.2007.4388991
M3 - Conference contribution
AN - SCOPUS:47349093410
SN - 9781424416592
T3 - VAST IEEE Symposium on Visual Analytics Science and Technology 2007, Proceedings
SP - 11
EP - 18
BT - VAST IEEE Symposium on Visual Analytics Science and Technology 2007, Proceedings
Y2 - 30 October 2007 through 1 November 2007
ER -