TY - JOUR
T1 - Prediction of likes and retweets using text information retrieval
AU - Daga, Ishita
AU - Gupta, Anchal
AU - Vardhan, Raj
AU - Mukherjee, Partha
N1 - Publisher Copyright: © 2020 The Authors.
PY - 2020
Y1 - 2020
N2 - Twitter is one of the major social media platforms used today to study human behaviour by analysing user interactions. To ensure the popularity of a tweet, the focus should be on content that attracts a large following for the message, with a sufficient number of likes and retweets. High-quality tweets increase the online reputation of the users who post them. If a user could obtain a prediction of the likes and retweets for a text before posting it, this would help improve the popularity of the tweet from an information-sharing perspective. In this paper, we employed several machine learning classifiers, including SVM, Naïve Bayes, Logistic Regression, Random Forest, and Neural Network, on top of two text processing approaches used in natural language processing (NLP), namely bag-of-words (TF-IDF) and word embeddings (Doc2Vec), to predict how many likes and retweets a tweet can generate. The results obtained indicate that all the models performed 10-15% better with the bag-of-words technique.
AB - Twitter is one of the major social media platforms used today to study human behaviour by analysing user interactions. To ensure the popularity of a tweet, the focus should be on content that attracts a large following for the message, with a sufficient number of likes and retweets. High-quality tweets increase the online reputation of the users who post them. If a user could obtain a prediction of the likes and retweets for a text before posting it, this would help improve the popularity of the tweet from an information-sharing perspective. In this paper, we employed several machine learning classifiers, including SVM, Naïve Bayes, Logistic Regression, Random Forest, and Neural Network, on top of two text processing approaches used in natural language processing (NLP), namely bag-of-words (TF-IDF) and word embeddings (Doc2Vec), to predict how many likes and retweets a tweet can generate. The results obtained indicate that all the models performed 10-15% better with the bag-of-words technique.
UR - http://www.scopus.com/inward/record.url?scp=85093123038&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85093123038&partnerID=8YFLogxK
U2 - 10.1016/j.procs.2020.02.273
DO - 10.1016/j.procs.2020.02.273
M3 - Conference article
AN - SCOPUS:85093123038
SN - 1877-0509
VL - 168
SP - 123
EP - 128
JO - Procedia Computer Science
JF - Procedia Computer Science
T2 - 2020 Complex Adaptive Systems Conference, CAS 2019
Y2 - 13 November 2019 through 15 November 2019
ER -