TY - JOUR
T1 - Mitigating Prediction Error of Deep Learning Streamflow Models in Large Data-Sparse Regions With Ensemble Modeling and Soft Data
AU - Feng, Dapeng
AU - Lawson, Kathryn
AU - Shen, Chaopeng
N1 - Funding Information:
This work was supported by US National Science Foundation Award EAR#1832294. K. Lawson and C. Shen have financial interests in HydroSapient, Inc., a company which could potentially benefit from the results of this research. This interest has been reviewed by the University in accordance with its Individual Conflict of Interest policy, for the purpose of maintaining the objectivity and the integrity of research at The Pennsylvania State University.
Publisher Copyright:
© 2021. American Geophysical Union. All Rights Reserved.
PY - 2021/7/28
Y1 - 2021/7/28
N2 - Predicting discharge in contiguously data-scarce or ungauged regions is needed for quantifying the global hydrologic cycle. We show that prediction in ungauged regions (PUR) has major, underrecognized uncertainty and is drastically more difficult than previous problems where basins can be represented by neighboring or similar basins (known as prediction in ungauged basins). While deep neural networks demonstrated stellar performance for streamflow predictions, performance nonetheless declined for PUR, benchmarked here with a new stringent region-based holdout test on a US data set with 671 basins. We tested approaches to reduce such errors, leveraging deep networks' flexibility to integrate “soft” data, such as a satellite-based soil moisture product or daily flow distributions which improved low-flow simulations. A novel input-selection ensemble improved average performance and greatly reduced catastrophic failures. Despite challenges, deep networks showed stronger performance metrics for PUR than traditional hydrologic models. They appear competitive for geoscientific modeling even in data-scarce settings.
UR - http://www.scopus.com/inward/record.url?scp=85111550366&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85111550366&partnerID=8YFLogxK
U2 - 10.1029/2021GL092999
DO - 10.1029/2021GL092999
M3 - Article
AN - SCOPUS:85111550366
SN - 0094-8276
VL - 48
JO - Geophysical Research Letters
JF - Geophysical Research Letters
IS - 14
M1 - e2021GL092999
ER -