A class of models for large zero-inflated spatial data

Ben Seiyon Lee, Murali Haran

Research output: Contribution to journalArticlepeer-review


Spatially correlated data with an excess of zeros, usually referred to as zero-inflated spatial data, arise in many disciplines. Examples include count data, for instance, abundance (or lack thereof) of animal species and disease counts, as well as semi-continuous data like observed precipitation. Spatial two-part models are a flexible class of models for such data. Fitting two-part models can be computationally expensive for large data due to high-dimensional dependent latent variables, costly matrix operations, and slow mixing Markov chains. We describe a flexible, computationally efficient approach for modeling large zero-inflated spatial data using the projection-based intrinsic conditional autoregression (PICAR) framework. We study our approach, which we call PICAR-Z, through extensive simulation studies and two environmental data sets. Our results suggest that PICAR-Z provides accurate predictions while remaining computationally efficient. An important goal of our work is to allow researchers who are not experts in computation to easily build computationally efficient extensions to zero-inflated spatial models; this also allows for a more thorough exploration of modeling choices in two-part models than was previously possible. We show that PICAR-Z is easy to implement and extend in popular probabilistic programming languages such as nimble and stan.

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • General Environmental Science
  • Agricultural and Biological Sciences (miscellaneous)
  • General Agricultural and Biological Sciences
  • Statistics, Probability and Uncertainty
  • Applied Mathematics

Cite this