TY - GEN
T1 - Harnessing Social Media to Identify Homeless Youth At-Risk of Substance Use
AU - Dou, Zi Yi
AU - Barman-Adhikari, Anamika
AU - Fang, Fei
AU - Yadav, Amulya
N1 - Funding Information:
Co-author Fang is supported in part by NSF grant IIS-1850477.
Publisher Copyright:
© 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2021
Y1 - 2021
AB - Homeless youth are a highly vulnerable population and report highly elevated rates of substance use. Prior work on mitigating substance use among homeless youth has primarily relied on survey data to get information about substance use among homeless youth, which can then be used to inform the design of targeted intervention programs. However, such survey data is often onerous to collect, is limited by its reliance on self-reports and retrospective recall, and quickly becomes dated. The advent of social media has provided us with an important data source for understanding the health behaviors of homeless youth. In this paper, we target this specific population and demonstrate how to detect substance use based on texts from social media. We collect 135K Facebook posts and comments together with survey responses from a group of homeless youth and use this data to build novel substance use detection systems with machine learning and natural language processing techniques. Experimental results show that our proposed methods achieve ROC-AUC scores of 0.77 on identifying certain kinds of substance use among homeless youth using Facebook conversations only, and ROC-AUC scores of 0.83 when combined with answers to four survey questions that are not about their demographic characteristics or substance use. Furthermore, we investigate connections between the characteristics of people's Facebook posts and substance use and provide insights about the problem.
UR - http://www.scopus.com/inward/record.url?scp=85130095616&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85130095616&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85130095616
T3 - 35th AAAI Conference on Artificial Intelligence, AAAI 2021
SP - 14748
EP - 14756
BT - 35th AAAI Conference on Artificial Intelligence, AAAI 2021
PB - Association for the Advancement of Artificial Intelligence
T2 - 35th AAAI Conference on Artificial Intelligence, AAAI 2021
Y2 - 2 February 2021 through 9 February 2021
ER -