Imaginary people representing real numbers: Generating personas from online social media data

J. An, H. Kwak, S. Jung, J. Salminen, M. Admad, B. Jansen

Research output: Contribution to journal › Article › peer-review

79 Scopus citations


We develop a methodology that automates the creation of imaginary people, referred to as personas, by processing complex behavioral and demographic data of social media audiences. Using data from a popular social media account containing more than 30 million interactions by viewers from 198 countries engaging with more than 4,200 online videos produced by a global media corporation, we demonstrate that our methodology achieves several novel results, including: (a) identifying distinct user behavioral segments based on users' content consumption patterns; (b) identifying impactful demographic groupings; and (c) creating rich persona descriptions by automatically adding pertinent attributes, such as names, photos, and personal characteristics. We validate our approach by implementing the methodology in a working system; we then evaluate it quantitatively by examining the accuracy of predicting the content preferences of personas, the stability of the personas over time, and the generalizability of the method by applying it to two other datasets. Our findings show that the approach can develop rich personas representing the behavior and demographics of real audiences using privacy-preserving aggregated online social media data from major online platforms. The results have implications for media companies and other organizations distributing content via online platforms.

Original language: English (US)
Article number: 27
Journal: ACM Transactions on the Web
Issue number: 4
State: Published - Nov 2018

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications

