Title: Hypothesis Tests for Evaluating Numerical Precipitation Forecasts
Author: Hamill, Thomas M.
Source: Weather and Forecasting, 1999, Volume 14, Issue 2, page 155
DOI: 10.1175/1520-0434(1999)014<0155:HTFENP>2.0.CO;2
Publisher: American Meteorological Society
Abstract: When evaluating differences between competing precipitation forecasts, formal hypothesis testing is rarely performed. This may be due to the difficulty in applying common tests given the spatial correlation and non-normality of errors. Possible ways around these difficulties are explored here. Two datasets of precipitation forecasts are evaluated: a set of two competing gridded precipitation forecasts from operational weather prediction models, and sets of competing probabilistic quantitative precipitation forecasts from model output statistics and from an ensemble of forecasts. For each test, data from each competing forecast are collected into one sample for each case day to avoid problems with spatial correlation. Next, several possible hypothesis test methods are evaluated: the paired t test, the nonparametric Wilcoxon signed-rank test, and two resampling tests. The more involved resampling test methodology is the most appropriate when testing threat scores from nonprobabilistic forecasts. The simpler paired t test or Wilcoxon test is appropriate to use in testing the skill of probabilistic forecasts evaluated with the ranked probability score.
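To make the comparison procedure concrete, the sketch below illustrates the three kinds of tests named in the abstract on synthetic per-case-day scores. It is not code from the paper: the data values and the within-day swapping scheme of the permutation test are assumptions made for illustration. SciPy's ttest_rel and wilcoxon supply the paired t test and Wilcoxon signed-rank test, and a simple paired resampling of daily contingency-table counts stands in for the paper's resampling test on threat scores. Aggregating to one score (or one set of contingency counts) per case day before testing mirrors the abstract's device for avoiding spatial correlation among grid points.

```python
# Illustrative sketch only (not code from the paper). All data are synthetic.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_days = 30  # hypothetical number of case days

# Hypothetical per-case-day ranked probability scores for two probabilistic
# forecast systems (lower RPS is better).
rps_model_a = rng.gamma(shape=2.0, scale=0.05, size=n_days)
rps_model_b = rps_model_a + rng.normal(loc=0.01, scale=0.02, size=n_days)

# Paired t test on the daily score differences (assumes roughly normal differences).
t_stat, t_pval = stats.ttest_rel(rps_model_a, rps_model_b)

# Nonparametric alternative: Wilcoxon signed-rank test on the same pairs.
w_stat, w_pval = stats.wilcoxon(rps_model_a, rps_model_b)

print(f"paired t test: t = {t_stat:.3f}, p = {t_pval:.3f}")
print(f"Wilcoxon test: W = {w_stat:.3f}, p = {w_pval:.3f}")

# Resampling (permutation) test sketch for a non-additive score such as the
# threat score. Daily contingency-table counts (hits, false alarms, misses)
# are pooled over case days before the score is computed, so the test
# statistic is the difference in pooled threat scores.
def threat_score(hits, false_alarms, misses):
    return hits / (hits + false_alarms + misses)

# Hypothetical daily counts for each model: columns are (hits, false alarms, misses).
counts_a = rng.integers(low=5, high=50, size=(n_days, 3))
counts_b = rng.integers(low=5, high=50, size=(n_days, 3))

def ts_difference(a, b):
    ha, fa, ma = a.sum(axis=0)
    hb, fb, mb = b.sum(axis=0)
    return threat_score(ha, fa, ma) - threat_score(hb, fb, mb)

observed = ts_difference(counts_a, counts_b)

# Build the null distribution by randomly swapping the two models' counts
# within each case day, which preserves the pairing of the data.
n_resamples = 10000
null = np.empty(n_resamples)
for i in range(n_resamples):
    swap = rng.random(n_days) < 0.5
    a_perm = np.where(swap[:, None], counts_b, counts_a)
    b_perm = np.where(swap[:, None], counts_a, counts_b)
    null[i] = ts_difference(a_perm, b_perm)

p_resample = np.mean(np.abs(null) >= np.abs(observed))
print(f"resampling test on threat score difference: p = {p_resample:.3f}")
```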
contributor author: Hamill, Thomas M.
date accessioned: 2017-06-09T14:57:03Z
date available: 2017-06-09T14:57:03Z
date copyright: 1999/04/01
date issued: 1999
identifier issn: 0882-8156
identifier other: ams-3033.pdf
identifier uri: http://onlinelibrary.yabesh.ir/handle/yetl/4167657
publisher: American Meteorological Society
title: Hypothesis Tests for Evaluating Numerical Precipitation Forecasts
type: Journal Paper
journal volume: 14
journal issue: 2
journal title: Weather and Forecasting
identifier doi: 10.1175/1520-0434(1999)014<0155:HTFENP>2.0.CO;2
journal firstpage: 155
journal lastpage: 167
tree: Weather and Forecasting; 1999; Volume 14; Issue 2
contenttype: Fulltext