TY - GEN
T1 - A Large-Scale Dataset of Interactions Between Weibo Users and Platform-Empowered LLM Agent
AU - Gu, Shaokui
AU - Yin, Yongjie
AU - Gong, Qingyuan
AU - Tong, Fenghua
AU - Zhou, Yipeng
AU - Duan, Qiang
AU - Chen, Yang
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s).
PY - 2025/11/10
Y1 - 2025/11/10
N2 - We release a large-scale dataset that captures interactions between human users and CommentRobert, an LLM-based social media agent on Weibo. The dataset contains Weibo posts in which users actively mention the LLM agent account @CommentRobert, indicating that the users are interested in interacting with the platform-empowered LLM agent. The dataset contains 557,645 interactions from 304,400 unique users over 17 months. We detail our data collection methodology, user attributes, and content characteristics, underscoring the dataset's value in examining real-world human-LLM agent interactions. Our analysis offers insights into the demographic and behavioral traits of users interested in the selected LLM agent, interaction dynamics between humans and the agent, and linguistic patterns in comments. These interactions provide a unique lens through which to explore how humans perceive, trust, and communicate with LLMs. This dataset enables further research into modeling human intent understanding, improving LLM agent design, and studying the evolution of human-LLM agent relationships. Potential applications also include long-term user engagement prediction and AI-generated comment detection on social platforms. This constructed dataset is available at https://zenodo.org/records/16921462.
UR - https://www.scopus.com/pages/publications/105023169300
UR - https://www.scopus.com/pages/publications/105023169300#tab=citedBy
U2 - 10.1145/3746252.3761607
DO - 10.1145/3746252.3761607
M3 - Conference contribution
AN - SCOPUS:105023169300
T3 - CIKM 2025 - Proceedings of the 34th ACM International Conference on Information and Knowledge Management
SP - 6392
EP - 6396
BT - CIKM 2025 - Proceedings of the 34th ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery, Inc
T2 - 34th ACM International Conference on Information and Knowledge Management, CIKM 2025
Y2 - 10 November 2025 through 14 November 2025
ER -