TY - JOUR
T1 - Reference data enhancement for geographic information retrieval using linked data
AU - Moura, Tiago H. V. M.
AU - Davis, Clodoveu A.
AU - Fonseca, Frederico T.
N1 - Publisher Copyright: © 2016 John Wiley & Sons Ltd
PY - 2017/8
Y1 - 2017/8
AB - Gazetteers are instrumental in recognizing place names in documents such as Web pages, news, and social media messages. However, creating and maintaining gazetteers is still a complex task. Even though some online gazetteers provide rich sets of geographic names at planetary scale (e.g., GeoNames), other sources must be used to recognize references to urban locations, such as street names, neighborhood names or landmarks. We propose integrating Linked Data sources to create a gazetteer that combines broad coverage of places with urban detail, including content on geographic and semantic relationships involving places, their multiple names and related non-geographic entities. Our final goal is to expand the possibilities for recognizing, disambiguating and filtering references to places in texts for geographic information retrieval (GIR) and related applications. The resulting ontological gazetteer, named LoG (Linked OntoGazetteer), is accessible through Web services by applications and research initiatives on GIR, text processing, named entity recognition and others. The gazetteer currently contains over 13 million places, 140 million attributes and relationships, and 4.5 million non-geographic entities. Data sources include GeoNames, Freebase, DBpedia and LinkedGeoData, which is based on OpenStreetMap data. An analysis of how these datasets overlap and complement one another is also presented.
UR - http://www.scopus.com/inward/record.url?scp=84996484141&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84996484141&partnerID=8YFLogxK
U2 - 10.1111/tgis.12238
DO - 10.1111/tgis.12238
M3 - Article
AN - SCOPUS:84996484141
SN - 1361-1682
VL - 21
SP - 683
EP - 700
JO - Transactions in GIS
JF - Transactions in GIS
IS - 4
ER -