TY - GEN
T1 - Identifying multi-regime behaviors of memes in Twitter data
AU - Griffin, Christopher
AU - Squicciarini, Anna C.
AU - Styer, Steven
N1 - Publisher Copyright: © 2014 The Science and Information (SAI) Organization.
PY - 2014/10/7
Y1 - 2014/10/7
N2 - Recent work has studied Twitter's role in distributing information about specific events, in acting as a platform for political debate, and in facilitating social interaction. Despite this interesting body of work, to our knowledge, it is unclear how trending words are used in Twitter and what their lifecycle is. In this work, we investigate statistical models of the dynamics of word/phrase use in Twitter over time. We identify four base behaviors, derived from the autocorrelation functions of the frequency of word/phrase use. We then observe drift among these base behaviors in our sampled words/phrases over multiple weeks. To the best of our knowledge, this is the first time a hybrid statistical model using Markov processes and ARIMA sub-models has been used to explain the occurrence of certain n-grams within the linguistic space of Twitter topics. The ultimate objective of this work is to develop a hierarchical model for the behavior of word/phrase occurrence within Twitter. The model supposes that word/phrase dynamics move from one regime to another as various exogenous forces act on the population of users. This paper takes the first steps in illustrating that these regimes exist and shows some of the dynamics of regime change.
AB - Recent work has studied Twitter's role in distributing information about specific events, in acting as a platform for political debate, and in facilitating social interaction. Despite this interesting body of work, to our knowledge, it is unclear how trending words are used in Twitter and what their lifecycle is. In this work, we investigate statistical models of the dynamics of word/phrase use in Twitter over time. We identify four base behaviors, derived from the autocorrelation functions of the frequency of word/phrase use. We then observe drift among these base behaviors in our sampled words/phrases over multiple weeks. To the best of our knowledge, this is the first time a hybrid statistical model using Markov processes and ARIMA sub-models has been used to explain the occurrence of certain n-grams within the linguistic space of Twitter topics. The ultimate objective of this work is to develop a hierarchical model for the behavior of word/phrase occurrence within Twitter. The model supposes that word/phrase dynamics move from one regime to another as various exogenous forces act on the population of users. This paper takes the first steps in illustrating that these regimes exist and shows some of the dynamics of regime change.
UR - http://www.scopus.com/inward/record.url?scp=84909594882&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84909594882&partnerID=8YFLogxK
U2 - 10.1109/SAI.2014.6918281
DO - 10.1109/SAI.2014.6918281
M3 - Conference contribution
AN - SCOPUS:84909594882
T3 - Proceedings of 2014 Science and Information Conference, SAI 2014
SP - 827
EP - 837
BT - Proceedings of 2014 Science and Information Conference, SAI 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 Science and Information Conference, SAI 2014
Y2 - 27 August 2014 through 29 August 2014
ER -