Machine Learning Models for Prediction of Shade-Affected Stream Temperatures

Efrain Noa-Yarasca; Meghna Babbar-Sebens; Chris E. Jordan

Source: Journal of Hydrologic Engineering, 2025, Volume 30, Issue 1, page 04024058-1

DOI: 10.1061/JHYEFF.HEENG-6227

Publisher: American Society of Civil Engineers

Abstract: This study evaluates the selection of predictor variables for machine learning (ML) models of stream temperatures affected by riparian shading. Two scenarios of model development were examined: ML models for the prediction of shade-affected stream temperatures at monitored locations (i.e., temporal prediction) and ML models for the prediction of shade-affected stream temperatures at unmonitored locations (i.e., spatial prediction) using data from the watershed’s nearby monitored gages. Notably, this study goes beyond previous research by examining the inclusion of riparian vegetation as a predictor variable alongside other commonly assessed predictors. The ultimate goal was to identify the optimal number of predictors as well as key predictors for developing efficient ML models in terms of accuracy and complexity. Two stations in the Dairy McKay Watershed (DMW) with long-term stream-temperature records were used to develop the temporal prediction ML models. For spatial prediction ML models, data from 29 monitored sites along the DMW were utilized. A total of 33 variables were analyzed, but variables exhibiting no temporal variation were excluded from temporal prediction ML models. The study revealed that ML models achieved reliable predictions with fewer predictors, reducing feature needs. Results showed that for temporal prediction ML models, a subset of four predictors (air temperature, stream flow, day of the year, and shade factor) resulted in the simplest model, with average RMSE values ranging between 0.78 and 0.88. For spatial prediction ML models, the simplest model consisted of six predictors excluding the shade factor, achieving an average RMSE of 0.54. Including riparian vegetation as a key predictor in temporal prediction ML models enabled us to assess its impact on mitigating stream-temperature rise. Thus, evaluation of three riparian vegetation scenarios using temporal prediction ML models showed their significant impact on lowering stream temperatures, aligning with physically based models.

This study focuses on selecting the best predictor variables to estimate stream temperatures influenced by riparian shade using machine learning (ML) models. The research evaluated ML models for predicting stream temperatures at both monitored sites (temporal prediction) and unmonitored sites using nearby data (spatial prediction). A key aspect was including riparian vegetation as a predictor along with other common variables to improve model accuracy and simplicity. Data from two long-term monitoring stations in the Dairy McKay Watershed were used for temporal predictions, while data from 29 sites were used for spatial predictions. Thirty-three variables were assessed, covering climate, watershed and stream characteristics, seasonality, and site location. The study found that ML models can achieve accurate predictions with fewer predictors. For temporal predictions, the best model used four predictors: air temperature, stream flow, day of the year, and shade factor. Including riparian vegetation significantly helped in reducing stream temperatures, as shown in temporal models, supporting its importance in managing stream-temperature rise. For spatial predictions, the simplest model used six predictors, excluding the shade factor.
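The abstract frames model choice as a trade-off between accuracy (RMSE) and complexity (number of predictors). The sketch below illustrates that idea only; it is not the authors' procedure, which this record does not describe. The function names, the `tolerance` parameter, and the rule "smallest subset within a tolerance of the best RMSE" are all assumptions made for illustration.

```python
import math
from typing import Dict, FrozenSet, Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) == 0 or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def simplest_adequate_subset(
    scores: Dict[FrozenSet[str], float], tolerance: float
) -> FrozenSet[str]:
    """Pick the smallest predictor subset whose RMSE is within
    `tolerance` of the best RMSE (hypothetical selection rule).
    Ties in subset size are broken by the lower RMSE."""
    best = min(scores.values())
    adequate = [s for s, err in scores.items() if err <= best + tolerance]
    return min(adequate, key=lambda s: (len(s), scores[s]))


# Made-up RMSE values for illustration; these are NOT results from the study.
example_scores = {
    frozenset({"air_temp"}): 1.5,
    frozenset({"air_temp", "flow"}): 1.0,
    frozenset({"air_temp", "flow", "day_of_year"}): 0.95,
}
chosen = simplest_adequate_subset(example_scores, tolerance=0.1)
```

With these made-up scores, a tolerance of 0.1 selects the two-predictor subset, while a tolerance of 0 selects the three-predictor subset with the lowest RMSE. In practice each score would come from validating a trained model on held-out data.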

URI

http://yetl.yabesh.ir/yetl1/handle/yetl/4304888

Collections

Journal of Hydrologic Engineering

contributor author	Efrain Noa-Yarasca
contributor author	Meghna Babbar-Sebens
contributor author	Chris E. Jordan
date accessioned	2025-04-20T10:31:30Z
date available	2025-04-20T10:31:30Z
date copyright	2024-11-29
date issued	2025
identifier other	JHYEFF.HEENG-6227.pdf
identifier uri	http://yetl.yabesh.ir/yetl1/handle/yetl/4304888
publisher	American Society of Civil Engineers
title	Machine Learning Models for Prediction of Shade-Affected Stream Temperatures
type	Journal Article
journal volume	30
journal issue	1
journal title	Journal of Hydrologic Engineering
identifier doi	10.1061/JHYEFF.HEENG-6227
journal firstpage	04024058-1
journal lastpage	04024058-14
page	14
tree	Journal of Hydrologic Engineering, 2025, Volume 30, Issue 1
contenttype	Fulltext

YaBeSH Engineering and Technology Library
