TY - GEN
T1 - A semantics-based approach to disclosure classification in user-generated online content
AU - Akiti, Chandan
AU - Squicciarini, Anna
AU - Rajtmajer, Sarah
N1 - Publisher Copyright:
© 2020 Association for Computational Linguistics
PY - 2020
Y1 - 2020
N2 - As users engage in public discourse, the rate of voluntarily disclosed personal information has seen a steep increase. So-called self-disclosure can result in a number of privacy concerns. Users are often unaware of the sheer amount of personal information they share across online forums, commentaries, and social networks, as well as the power of modern AI to synthesize and gain insights from this data. This paper presents an approach to detect emotional and informational self-disclosure in natural language.We hypothesize that identifying frame semantics can meaningfully support this task. Specifically, we use Semantic Role Labeling to identify the lexical units and their semantic roles that signal self-disclosure. Experimental results on Reddit data show the performance gain of our method when compared to standard text classification methods based on BiLSTM, and BERT. In addition to improved performance, our approach provides insights into the drivers of disclosure behaviors.
AB - As users engage in public discourse, the rate of voluntarily disclosed personal information has seen a steep increase. So-called self-disclosure can result in a number of privacy concerns. Users are often unaware of the sheer amount of personal information they share across online forums, commentaries, and social networks, as well as the power of modern AI to synthesize and gain insights from this data. This paper presents an approach to detect emotional and informational self-disclosure in natural language.We hypothesize that identifying frame semantics can meaningfully support this task. Specifically, we use Semantic Role Labeling to identify the lexical units and their semantic roles that signal self-disclosure. Experimental results on Reddit data show the performance gain of our method when compared to standard text classification methods based on BiLSTM, and BERT. In addition to improved performance, our approach provides insights into the drivers of disclosure behaviors.
UR - http://www.scopus.com/inward/record.url?scp=85118436541&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85118436541&partnerID=8YFLogxK
U2 - 10.18653/v1/2020.findings-emnlp.312
DO - 10.18653/v1/2020.findings-emnlp.312
M3 - Conference contribution
AN - SCOPUS:85118436541
T3 - Findings of the Association for Computational Linguistics Findings of ACL: EMNLP 2020
SP - 3490
EP - 3499
BT - Findings of the Association for Computational Linguistics Findings of ACL
PB - Association for Computational Linguistics (ACL)
T2 - Findings of the Association for Computational Linguistics, ACL 2020: EMNLP 2020
Y2 - 16 November 2020 through 20 November 2020
ER -