A multilevel logistic regression model for identifying the relevance of environmental risk factors on Gestational Diabetes Mellitus

  • Carolina Gonzalez-Canas
  • Toyya A. Pujol
  • Paul Griffin
  • Zachary Hass

Research output: Contribution to journal › Article › peer-review

Abstract

The overarching goal of this research is to determine whether a woman's risk of developing Gestational Diabetes Mellitus (GDM) is affected by environmental factors. The role of environmental factors is an important question for public health policy across many diseases. This paper also highlights several methodological challenges specific to the types of data commonly available to address this and related research questions. Medicaid health insurance claims for Indiana were used to identify women who were pregnant during the study period, the GDM outcome variable, and demographic control variables. The Medicaid location data were available at the region level (three-digit ZIP code, ZIP-3), and the corresponding regional environmental factors were rolled up to the ZIP-3 level from public county data. We fit a multilevel logistic regression model (MLM) to account for the correlation caused by the clustering of women within the same regions. Model results generally align with known risk factors; in addition, a region's racial makeup, number of birthing hospitals, food environment index, and amount of air pollution were found to be risk factors for GDM. This is, to the best of our knowledge, the first research that tests the association of multiple environmental factors with GDM. Although the model suits the structure of the data, we see several challenges that must be overcome to realize the full utility of MLM with currently available data. (1) Some form of data triangulation is necessary to overcome false negatives in the outcome variable that arise from the use of health insurance claims. (2) A more favorable data use agreement is necessary to allow more granular identification of patient location, so that relationships between region-level variables and GDM risk are not obscured. (3) The impact of sample balancing on inference for multilevel logistic model coefficients remains an open question.
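
The abstract does not give the exact model specification. As a minimal sketch, assuming a random-intercept form in which each ZIP-3 region j contributes a deviation u_j on the log-odds scale, the code below shows how individual-level terms, region-level terms, and the region intercept might combine into a GDM probability, and how the clustering can be summarized by the latent-scale intraclass correlation. All function names and numeric values are hypothetical and chosen only for illustration; none are taken from the paper.

```python
import math


def inv_logit(eta: float) -> float:
    """Map a linear predictor on the log-odds scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def gdm_probability(intercept, individual_effects, region_effects, region_intercept):
    """Random-intercept logistic model (assumed form, not the paper's fit).

    Individual-level terms, region-level terms, and the region-specific
    deviation u_j are summed on the log-odds scale, then transformed.
    """
    eta = intercept + sum(individual_effects) + sum(region_effects) + region_intercept
    return inv_logit(eta)


def latent_icc(region_variance: float) -> float:
    """Latent-scale intraclass correlation for a logistic model.

    The individual-level residual variance of the standard logistic
    distribution is fixed at pi^2 / 3.
    """
    return region_variance / (region_variance + math.pi ** 2 / 3)


# Hypothetical values for illustration only; none come from the paper.
p_low = gdm_probability(-2.5, [0.4], [0.1], region_intercept=-0.3)
p_high = gdm_probability(-2.5, [0.4], [0.1], region_intercept=0.3)
print(round(p_low, 3), round(p_high, 3), round(latent_icc(0.2), 3))
```

Two women with identical covariates receive different predicted probabilities when their regions have different intercepts, which is the within-region correlation the multilevel model is meant to absorb.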

Original language: English (US)
Article number: 100152
Journal: Healthcare Analytics
Volume: 3
State: Published - Nov 2023

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

All Science Journal Classification (ASJC) codes

  • Analytical Chemistry
  • Health Informatics
