Machine Learning Models for Prediction of Shade-Affected Stream Temperatures
Source: Journal of Hydrologic Engineering, 2025, Volume 30, Issue 1, page 04024058-1
DOI: 10.1061/JHYEFF.HEENG-6227
Publisher: American Society of Civil Engineers
Abstract: This study evaluates the selection of predictor variables for machine learning (ML) models of stream temperatures affected by riparian shading. Two scenarios of model development were examined: ML models for the prediction of shade-affected stream temperatures at monitored locations (i.e., temporal prediction) and ML models for the prediction of shade-affected stream temperatures at unmonitored locations (i.e., spatial prediction) using data from the watershed’s nearby monitored gages. Notably, this study goes beyond previous research by examining the inclusion of riparian vegetation as a predictor variable alongside other commonly assessed predictors. The ultimate goal was to identify the optimal number of predictors, as well as the key predictors, for developing ML models that are efficient in terms of accuracy and complexity. Two stations in the Dairy McKay Watershed (DMW) with long-term stream-temperature records were used to develop the temporal prediction ML models. For the spatial prediction ML models, data from 29 monitored sites along the DMW were used. A total of 33 variables were analyzed, but variables exhibiting no temporal variation were excluded from the temporal prediction ML models. The study revealed that ML models achieved reliable predictions with fewer predictors, reducing the number of input features required. For temporal prediction ML models, a subset of four predictors (air temperature, stream flow, day of the year, and shade factor) resulted in the simplest model, with average RMSE values ranging between 0.78 and 0.88. For spatial prediction ML models, the simplest model consisted of six predictors, excluding the shade factor, and achieved an average RMSE of 0.54. Including riparian vegetation as a key predictor in the temporal prediction ML models made it possible to assess its impact on mitigating stream-temperature rise. Evaluation of three riparian vegetation scenarios using the temporal prediction ML models showed that vegetation has a significant impact on lowering stream temperatures, in agreement with physically based models.

This study focuses on selecting the best predictor variables to estimate stream temperatures influenced by riparian shade using machine learning (ML) models. The research evaluated ML models for predicting stream temperatures at both monitored sites (temporal prediction) and unmonitored sites using nearby data (spatial prediction). A key aspect was including riparian vegetation as a predictor along with other common variables to improve model accuracy and simplicity. Data from two long-term monitoring stations in the Dairy McKay Watershed were used for temporal predictions, while data from 29 sites were used for spatial predictions. Thirty-three variables were assessed, covering climate, watershed and stream characteristics, seasonality, and site location. The study found that ML models can achieve accurate predictions with fewer predictors. For temporal predictions, the best model used four predictors: air temperature, stream flow, day of the year, and shade factor. The temporal models indicated that riparian vegetation significantly reduces stream temperatures, supporting its importance in managing stream-temperature rise. For spatial predictions, the simplest model used six predictors, excluding the shade factor.
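The trade-off between accuracy and complexity described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes one simple selection rule (choose the smallest predictor subset whose RMSE lies within a tolerance of the best RMSE), and the predictor names, scores, and tolerance below are hypothetical values made up for illustration, not results from the study.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def simplest_adequate_subset(scores, tolerance):
    """Return the smallest predictor subset within `tolerance` of the best RMSE.

    `scores` maps a tuple of predictor names to the RMSE of the model
    trained on that subset. Ties in size are broken by lower RMSE.
    This rule is an assumption for illustration only.
    """
    best = min(scores.values())
    adequate = [s for s, err in scores.items() if err <= best + tolerance]
    return min(adequate, key=lambda s: (len(s), scores[s]))


# Hypothetical scores; these are NOT values reported in the article.
toy_scores = {
    ("air_temp",): 2.00,
    ("air_temp", "flow"): 1.50,
    ("air_temp", "flow", "day_of_year", "shade"): 1.00,
    ("air_temp", "flow", "day_of_year", "shade", "humidity"): 0.98,
}

chosen = simplest_adequate_subset(toy_scores, tolerance=0.05)
print(chosen)
```

With these toy numbers the five-predictor model has the lowest error, but the four-predictor model is within the tolerance and is therefore preferred as the simpler option.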
contributor author | Efrain Noa-Yarasca | |
contributor author | Meghna Babbar-Sebens | |
contributor author | Chris E. Jordan | |
date accessioned | 2025-04-20T10:31:30Z | |
date available | 2025-04-20T10:31:30Z | |
date copyright | 11/29/2024 12:00:00 AM | |
date issued | 2025 | |
identifier other | JHYEFF.HEENG-6227.pdf | |
identifier uri | http://yetl.yabesh.ir/yetl1/handle/yetl/4304888 | |
publisher | American Society of Civil Engineers | |
title | Machine Learning Models for Prediction of Shade-Affected Stream Temperatures | |
type | Journal Article | |
journal volume | 30 | |
journal issue | 1 | |
journal title | Journal of Hydrologic Engineering | |
identifier doi | 10.1061/JHYEFF.HEENG-6227 | |
journal firstpage | 04024058-1 | |
journal lastpage | 04024058-14 | |
page | 14 | |
tree | Journal of Hydrologic Engineering; 2025; Volume 30; Issue 1 | |
contenttype | Fulltext |