TY - JOUR
T1 - From hydrometeorology to river water quality
T2 - Can a deep learning model predict dissolved oxygen at the continental scale?
AU - Zhi, Wei
AU - Feng, Dapeng
AU - Tsai, Wen Ping
AU - Sterle, Gary
AU - Harpold, Adrian
AU - Shen, Chaopeng
AU - Li, Li
N1 - Funding Information:
This study was supported by a seed grant from the Penn State Institute of Computation and Data Science, the U.S. Department of Energy (DOE) Subsurface Biogeochemical Research (SBR) program (DE-SC0016221), and the U.S. National Science Foundation grant EAR-1724171. A.A.H. and G.S. were supported by National Science Foundation grants (EAR 1723990 and EAR 1724171). The CAMELS-Chem DO data set is deposited in the GitHub repository at https://github.com/LiReactiveWater/CAMELS-Chem-DO-dataset. The hydrometeorological time-series data and watershed attributes are available at the CAMELS data website (https://ral.ucar.edu/solutions/products/camels). The deep-learning LSTM code is available on GitHub at https://github.com/mhpi/hydroDL.
Publisher Copyright:
© 2020 American Chemical Society.
PY - 2021/2/16
Y1 - 2021/2/16
AB - Dissolved oxygen (DO) reflects river metabolic pulses and is an essential water quality measure. Our capability to forecast DO, however, remains limited. Water quality data, specifically DO data here, often have large gaps and sparse areal and temporal coverage. Earth surface and hydrometeorology data, on the other hand, have become widely available. Here we ask: can a Long Short-Term Memory (LSTM) model learn river DO dynamics from sparse DO data and intensive (daily) hydrometeorology data? We used CAMELS-chem, a new data set with DO concentrations from 236 minimally disturbed watersheds across the U.S. The model generally learns the theory of DO solubility and captures the decreasing trend of DO with increasing water temperature. It shows potential for predicting DO in "chemically ungauged basins", defined as basins without any measurements of DO or, more broadly, of water quality in general. The model, however, misses some DO peaks and troughs when in-stream biogeochemical processes become important. Surprisingly, the model does not perform better where more data are available. Instead, it performs better in basins with low variations of streamflow and DO, a high runoff ratio (>0.45), and winter precipitation peaks. These results suggest that more data collection at DO peaks and troughs and in sparsely monitored areas is essential to overcome data scarcity, an outstanding challenge in the water quality community.
UR - http://www.scopus.com/inward/record.url?scp=85100697125&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85100697125&partnerID=8YFLogxK
U2 - 10.1021/acs.est.0c06783
DO - 10.1021/acs.est.0c06783
M3 - Article
C2 - 33533608
AN - SCOPUS:85100697125
SN - 0013-936X
VL - 55
SP - 2357
EP - 2368
JO - Environmental Science and Technology
JF - Environmental Science and Technology
IS - 4
ER -