TY - JOUR
T1 - Deep learning for real-time social media text classification for situation awareness–using Hurricanes Sandy, Harvey, and Irma as case studies
AU - Yu, Manzhu
AU - Huang, Qunying
AU - Qin, Han
AU - Scheele, Chris
AU - Yang, Chaowei
N1 - Publisher Copyright: © 2019 Informa UK Limited, trading as Taylor & Francis Group.
PY - 2019/11/2
Y1 - 2019/11/2
AB - Social media platforms have been contributing to disaster management during the past several years. Text mining solutions using traditional machine learning techniques have been developed to categorize messages into different themes, such as caution and advice, to better understand their meaning and leverage useful information from social media text content. However, these methods are mostly event-specific and difficult to generalize for cross-event classification. In other words, traditional classification models trained on historic datasets are not capable of categorizing social media messages from a future event. This research examines the capability of a convolutional neural network (CNN) model in cross-event Twitter topic classification based on three geo-tagged Twitter datasets collected during Hurricanes Sandy, Harvey, and Irma. The performance of the CNN model is compared to two traditional machine learning methods: support vector machine (SVM) and logistic regression (LR). Experimental results showed that CNN models achieved consistently better accuracy in both single-event and cross-event evaluation scenarios, whereas the cross-event accuracy of SVM and LR models was lower than their own single-event accuracy. This indicated that the CNN model can be pre-trained on Twitter data from past events to classify messages from an upcoming event for situational awareness.
UR - https://www.scopus.com/pages/publications/85061431804
U2 - 10.1080/17538947.2019.1574316
DO - 10.1080/17538947.2019.1574316
M3 - Article
AN - SCOPUS:85061431804
SN - 1753-8947
VL - 12
SP - 1230
EP - 1247
JO - International Journal of Digital Earth
JF - International Journal of Digital Earth
IS - 11
ER -