TY - JOUR
T1 - A multi-modal approach towards mining social media data during natural disasters - A case study of Hurricane Irma
AU - Mohanty, Somya D.
AU - Biggers, Brown
AU - Sayedahmed, Saed
AU - Pourebrahim, Nastaran
AU - Goldstein, Evan B.
AU - Bunch, Rick
AU - Chi, Guangqing
AU - Sadri, Fereidoon
AU - McCoy, Tom P.
AU - Cosby, Arthur
N1 - Publisher Copyright:
© 2021 Elsevier Ltd
PY - 2021/2/15
Y1 - 2021/2/15
N2 - Streaming social media provides a real-time glimpse of extreme weather impacts. However, the volume of streaming data makes mining information a challenge for emergency managers, policy makers, and disciplinary scientists. Here we explore the effectiveness of data learned approaches to mine and filter information from streaming social media data from Hurricane Irma's landfall in Florida, USA. We use 54,383 Twitter messages (out of 784 K geolocated messages) from 16,598 users from Sept. 10–12, 2017 to develop 4 independent models to filter data for relevance: 1) a geospatial model based on forcing conditions at the place and time of each tweet, 2) an image classification model for tweets that include images, 3) a user model to predict the reliability of the tweeter, and 4) a text model to determine if the text is related to Hurricane Irma. All four models are independently tested, and can be combined to quickly filter and visualize tweets based on user-defined thresholds for each submodel. We envision that this type of filtering and visualization routine can be useful as a base model for data capture from noisy sources such as Twitter. The data can then be subsequently used by policy makers, environmental managers, emergency managers, and domain scientists interested in finding tweets with specific attributes to use during different stages of the disaster (e.g., preparedness, response, and recovery), or for detailed research.
UR - http://www.scopus.com/inward/record.url?scp=85099619918&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85099619918&partnerID=8YFLogxK
U2 - 10.1016/j.ijdrr.2020.102032
DO - 10.1016/j.ijdrr.2020.102032
M3 - Article
C2 - 33542893
AN - SCOPUS:85099619918
SN - 2212-4209
VL - 54
JO - International Journal of Disaster Risk Reduction
JF - International Journal of Disaster Risk Reduction
M1 - 102032
ER -