Studying patterns and predictors of HIV viral suppression using A Big Data approach: a research protocol

Jiajia Zhang, Bankole Olatosi, Xueying Yang, Sharon Weissman, Zhenlong Li, Jianjun Hu, Xiaoming Li

Research output: Contribution to journal › Article › peer-review

7 Scopus citations


Background: Given the importance of viral suppression in ending the HIV epidemic in the US and elsewhere, an optimal predictive model of viral status can help clinicians identify those at risk of poor viral control and inform clinical improvements in HIV treatment and care. With an increasing availability of electronic health record (EHR) data and social environmental information, there is a unique opportunity to improve our understanding of the dynamic pattern of viral suppression. Using a statewide cohort of people living with HIV (PLWH) in South Carolina (SC), the overall goal of the proposed research is to examine the dynamic patterns of viral suppression, develop optimal predictive models of various viral suppression indicators, and translate the models to a beta version of service-ready tools for clinical decision support.
Methods: The PLWH cohort will be identified through the SC Enhanced HIV/AIDS Reporting System (eHARS). The SC Office of Revenue and Fiscal Affairs (RFA) will extract longitudinal EHR clinical data of all PLWH in SC from multiple health systems, obtain data from other state agencies, and link the patient-level data with county-level data from multiple publicly available data sources. Using the deidentified data, the proposed study will consist of three operational phases: Phase 1: “Pattern Analysis” to identify the longitudinal dynamics of viral suppression using multiple viral load indicators; Phase 2: “Model Development” to determine the critical predictors of multiple viral load indicators through artificial intelligence (AI)-based modeling accounting for multilevel factors; and Phase 3: “Translational Research” to develop a multifactorial clinical decision system based on a risk prediction model to assist with the identification of the risk of viral failure or viral rebound when patients present at clinical visits.
Discussion: With both extensive data integration and data analytics, the proposed research will: (1) improve the understanding of the complex inter-related effects of longitudinal trajectories of HIV viral suppression and HIV treatment history while taking multilevel factors into consideration; and (2) develop empirical public health approaches to ending the HIV epidemic by translating the risk prediction model into a multifactorial decision system that makes AI-assisted clinical decisions feasible.
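
Illustrative sketch (not part of the published protocol): the Methods describe deriving viral load indicators from longitudinal records and linking patient-level data to county-level data. The Python below shows one way such features might be assembled. Everything in it is an assumption made for illustration: the 200 copies/mL cutoff (a commonly used threshold; the abstract does not state the study's definition), the simple rebound rule, the field names, and the made-up records.

```python
from datetime import date

# Assumed threshold (copies/mL); the abstract does not give the study's definition.
SUPPRESSION_THRESHOLD = 200


def viral_indicators(measurements):
    """Summarize one patient's viral loads, given (date, copies_per_ml) tuples."""
    ordered = sorted(measurements)  # chronological order
    suppressed = [vl < SUPPRESSION_THRESHOLD for _, vl in ordered]
    # Assumed rebound rule: a suppressed result directly followed by an unsuppressed one.
    rebound = any(a and not b for a, b in zip(suppressed, suppressed[1:]))
    return {
        "n_tests": len(ordered),
        "ever_suppressed": any(suppressed),
        "suppressed_at_last_test": suppressed[-1] if suppressed else None,
        "rebound": rebound,
    }


def link_county(patient_rows, county_rows):
    """Attach county-level fields to each patient row via a shared county key."""
    return [{**row, **county_rows.get(row["county"], {})} for row in patient_rows]


# Hypothetical records, invented for this example only.
labs = {
    "P1": [(date(2020, 1, 5), 15000), (date(2020, 7, 1), 150), (date(2021, 1, 9), 900)],
    "P2": [(date(2020, 3, 2), 400), (date(2020, 9, 8), 50)],
}
patients = [{"patient_id": "P1", "county": "A"}, {"patient_id": "P2", "county": "B"}]
counties = {"A": {"poverty_rate": 0.18}, "B": {"poverty_rate": 0.11}}

# One feature row per patient: identifiers, county-level context, viral indicators.
features = [
    {**row, **viral_indicators(labs[row["patient_id"]])}
    for row in link_county(patients, counties)
]
```

Rows of this shape, combining individual indicators with county-level context, are the kind of multilevel input a Phase 2 prediction model could take; the protocol itself does not specify this layout.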

Original language: English (US)
Article number: 122
Journal: BMC Infectious Diseases
Issue number: 1
State: Published - Dec 2022

All Science Journal Classification (ASJC) codes

  • Infectious Diseases
