Illuminating patterns and processes of water quality in U.S. rivers using physics-guided deep learning

Project: Research project

Project Details


Water quality problems are fundamental, universal challenges in society. Persistent nutrient pollution has caused eutrophication and harmful algal blooms globally, estimated to cost more than 4 billion dollars annually in the United States alone. Nutrient pollution threatens ecosystems and food production. Soil erosion will continue to grow with the global urban population. The United States has spent more than a trillion dollars to improve water quality since 1972, equivalent to annual spending of $100 per American, making clean water arguably one of the most expensive environmental investments, more than the cost of clean air. Understanding water quality dynamics is essential yet has remained a major challenge, partly due to its complex nature and data scarcity. This project aims to improve understanding of water quality dynamics by developing forecasting tools and advancing knowledge on how and why water quality changes under different conditions and places. The outcomes will help policymakers, water managers, and the broader public to make informed decisions that ensure the sustainability of water resources.
Despite tremendous progress and efforts in the past decades, water quality measurements have remained arduous and expensive, leading to inconsistent data coverage. Understanding of water quality dynamics therefore is often limited to individual sites. The project aims to determine the patterns of and processes that regulate concentration-discharge relationships of water quality variables across the United States. The project will focus on common water quality variables, including nitrate, total phosphorus, and turbidity (a proxy for total suspended sediment). The project will test whether spatial patterns of concentration-discharge relationships are driven predominantly by land use (relative to other drivers) that regulates hydrological flow paths and source water biogeochemistry.
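For readers unfamiliar with the term, a concentration-discharge relationship is often summarized by a power law, C = a * Q^b, fitted in log-log space. The sketch below is an illustration only and is not taken from the project: the function name `fit_cq_power_law` and the sample values are hypothetical, and the power-law form is a common convention assumed here rather than the method the project states it will use. An exponent b near zero is usually read as near-constant concentration, b < 0 as dilution with rising flow, and b > 0 as mobilization with rising flow.

```python
import math


def fit_cq_power_law(discharge, concentration):
    """Least-squares fit of C = a * Q**b in log-log space; returns (a, b).

    Hypothetical helper for illustration. Non-positive values are dropped
    because their logarithm is undefined.
    """
    pairs = [
        (math.log(q), math.log(c))
        for q, c in zip(discharge, concentration)
        if q > 0 and c > 0
    ]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two positive (Q, C) pairs")

    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("discharge values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)

    b = sxy / sxx                        # slope in log-log space
    a = math.exp(mean_y - b * mean_x)    # intercept mapped back to linear space
    return a, b


# Made-up values following C = 3 * Q**-0.5 (a dilution pattern).
flows = [1.0, 2.0, 4.0, 8.0, 16.0]
concs = [3.0 * q ** -0.5 for q in flows]
a, b = fit_cq_power_law(flows, concs)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Comparing the fitted exponent b across many sites is one simple way to describe spatial patterns of the kind the project sets out to explain.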
The hypotheses will be tested using Process-Guided Deep Learning integrating traditional Long Short-Term Memory models with reactive transport models. The integration will address the limitations of data scarcity and the "black box" nature of deep learning models, and advance predictive accuracy. The project will also 1) make the reconstructed data publicly available; 2) share the trained models for prediction in unmonitored time, space and future scenarios; 3) create videos to educate stakeholders on how to use the models; and 4) broaden participation in the field of artificial intelligence/machine learning.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Effective start/end date: 3/15/24 → 2/28/27


  • National Science Foundation: $447,958.00

