TY - JOUR
T1 - GeoCorpora: building a corpus to test and train microblog geoparsers
AU - Wallgrün, Jan Oliver
AU - Karimzadeh, Morteza
AU - MacEachren, Alan M.
AU - Pezanowski, Scott
N1 - Publisher Copyright: © 2017 Informa UK Limited, trading as Taylor & Francis Group.
PY - 2018/1/2
Y1 - 2018/1/2
AB - In this article, we present the GeoCorpora corpus building framework and software tools as well as a geo-annotated Twitter corpus built with these tools to foster research and development in the areas of microblog/Twitter geoparsing and geographic information retrieval. The developed framework employs crowdsourcing and geovisual analytics to support the construction of large corpora of text in which the mentioned location entities are identified and geolocated to toponyms in existing geographical gazetteers. We describe how the approach has been applied to build a corpus of geo-annotated tweets that will be made freely available to the research community alongside this article to support the evaluation, comparison and training of geoparsers. Additionally, we report lessons learned related to corpus construction for geoparsing as well as insights about the notions of place and natural spatial language that we derive from application of the framework to building this corpus.
UR - http://www.scopus.com/inward/record.url?scp=85029432852&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85029432852&partnerID=8YFLogxK
U2 - 10.1080/13658816.2017.1368523
DO - 10.1080/13658816.2017.1368523
M3 - Article
AN - SCOPUS:85029432852
SN - 1365-8816
VL - 32
SP - 1
EP - 29
JO - International Journal of Geographical Information Science
JF - International Journal of Geographical Information Science
IS - 1
ER -