TY - GEN
T1 - Distilling Knowledge on Text Graph for Social Media Attribute Inference
AU - Li, Quan
AU - Li, Xiaoting
AU - Chen, Lingwei
AU - Wu, Dinghao
N1 - Funding Information:
The work was supported in part by a seed grant from the Penn State Center for Security Research and Education (CSRE).
Publisher Copyright:
© 2022 ACM.
PY - 2022/7/6
Y1 - 2022/7/6
N2 - The popularization of social media generates a large amount of user-oriented data, where text data especially attracts researchers and speculators to infer user attributes (e.g., age, gender) for fulfilling their intents. Generally, this line of work casts attribute inference as a text classification problem, and starts to leverage graph neural networks for higher-level text representations. However, these text graphs are constructed on words, suffering from high memory consumption and ineffectiveness on few labeled texts. To address this challenge, we design a text-graph-based few-shot learning model for social media attribute inferences. Our model builds a text graph with texts as nodes and edges learned from current text representations via manifold learning and message passing. To further use unlabeled texts to improve few-shot performance, a knowledge distillation is devised to optimize the problem. This offers a trade-off between expressiveness and complexity. Experiments on social media datasets demonstrate the state-of-the-art performance of our model on attribute inferences with considerably fewer labeled texts.
UR - http://www.scopus.com/inward/record.url?scp=85135078520&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85135078520&partnerID=8YFLogxK
U2 - 10.1145/3477495.3531968
DO - 10.1145/3477495.3531968
M3 - Conference contribution
AN - SCOPUS:85135078520
T3 - SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
SP - 2024
EP - 2028
BT - SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
PB - Association for Computing Machinery, Inc
T2 - 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2022
Y2 - 11 July 2022 through 15 July 2022
ER -