Building a social media-based HIV risk behavior index to inform the prediction of HIV new diagnosis: A feasibility study

Zhenlong Li, Shan Qiao, Yuqin Jiang, Xiaoming Li

Research output: Contribution to journal › Article › peer-review

6 Scopus citations


Objective: Analysis of geolocation-based social media Big Data provides unprecedented opportunities for a broad range of domains, including health, because health is intrinsically linked to the geographic characteristics of places. HIV infection is largely driven by HIV risk behaviors, such as unsafe sexual behavior and drug abuse/addiction. This study explores the feasibility of building a Social media-based HIV Risk Behavior (SRB) index at the United States county level to inform HIV surveillance and prevention, considering social determinants of health and geographic locations.

Methods: The SRB index, defined as the proportion of risk behavior-related Twitter users among all Twitter users, was calculated at the county level for each year. To evaluate the performance of the new SRB index, the relationships between the county-level SRB and the rate of new HIV diagnoses from AIDSVu were analyzed using multivariate regression, with five socioeconomic status (SES) factors (percentage uninsured, median household income, Gini coefficient, percentage living in poverty, percentage high school graduates) included in the model. Moran's I and geographically weighted regression (GWR) analyses were used to examine spatial autocorrelation and to reveal potential spatial heterogeneity (geographical variability) in the associations.

Results: County-level multivariate regression results revealed that, compared with the five SES factors, SRB has the strongest association with the new HIV diagnosis rate (r > 0.36; P < 0.0001) in both years. Hierarchical regression analysis suggested that the SRB index explains significant additional variance beyond the five SES factors. The GWR analysis not only greatly improved the explanatory power of the model (raising the adjusted R-squared from 0.25 to 0.47 in 2016 and from 0.26 to 0.55 in 2017) but also revealed that the SRB index is the most spatially consistent measure, compared with the five SES factors, in terms of the direction of its effect (negative or positive correlation).

Conclusion: It is feasible to build a social media-based HIV risk behavior (SRB) index as a new indicator for HIV surveillance at the county level. The SRB index improves the explanatory power of the regression model for new HIV diagnoses by providing information beyond traditional social determinant measures such as SES indicators. The SRB index will allow researchers to use data captured within existing social media platforms to better understand the geospatial patterns of HIV risk behavior and to inform population-based HIV surveillance and other HIV prevention and control efforts.
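The index definition in the Methods (risk behavior-related Twitter users divided by all Twitter users, per county and year) can be illustrated with a minimal sketch. This is not the authors' implementation: the record layout, the function name `srb_index`, the sample data, and the rule that a user counts as risk-related if any one of their tweets is flagged are all assumptions made for illustration. The abstract does not say how tweets or users were classified.

```python
from collections import defaultdict


def srb_index(tweet_records):
    """Return {(county, year): proportion of risk-related users}.

    tweet_records: iterable of (county_fips, year, user_id, is_risk_tweet).
    Assumption (not from the source): a user is risk-related in a given
    county-year if at least one of their tweets there is flagged.
    """
    all_users = defaultdict(set)
    risk_users = defaultdict(set)
    for county, year, user_id, is_risk_tweet in tweet_records:
        key = (county, year)
        all_users[key].add(user_id)
        if is_risk_tweet:
            risk_users[key].add(user_id)
    # Every key in all_users has at least one user, so no division by zero.
    return {
        key: len(risk_users.get(key, ())) / len(users)
        for key, users in all_users.items()
    }


# Hypothetical sample data: county codes and user IDs are made up.
records = [
    ("45079", 2016, "u1", True),
    ("45079", 2016, "u1", False),  # same user, second tweet
    ("45079", 2016, "u2", False),
    ("45079", 2016, "u3", False),
    ("45079", 2016, "u4", False),
    ("45079", 2017, "u1", False),
    ("45019", 2016, "u5", True),
    ("45019", 2016, "u6", False),
]
index = srb_index(records)
```

Counting distinct users rather than tweets keeps a single prolific account from inflating a county's value, which matches the user-level definition given in the Methods.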

Original language: English (US)
Pages (from-to): S91-S99
State: Published - May 1 2021

All Science Journal Classification (ASJC) codes

  • Immunology and Allergy
  • Immunology
  • Infectious Diseases
