Evaluating forecasts with the RPSS

Published April 24, 2009

In response to user feedback, the Southwest Climate Outlook has changed its temperature and precipitation forecast verification highlights to incorporate a more accurate evaluation method, the Ranked Probability Skill Score (RPSS). To the mathematically wary, this name likely causes anxiety. Indeed, the RPSS is a complicated equation. But it helps answer a critical question: have the forecasts been accurate? Knowing this helps users incorporate the forecasts into decisions, such as when to purchase hay to avoid high costs or how much water to allocate to irrigation districts.

Scientists often evaluate a forecast by calculating its skill, which is the accuracy of the forecast relative to a reference forecast. A “skillful” forecast shows improvement over the reference forecast. For example, a poker player may claim to beat the house more often than not. If the game played has 50:50 odds, the poker player must win more than 50 percent of the games to show skill over the odds (the reference forecast).

The National Oceanic and Atmospheric Administration’s Climate Prediction Center (NOAA-CPC) began forecasting successive three-month periods in 1994, and these forecasts spanned two weeks to 13 months into the future. But the usefulness of these forecasts depends on their accuracy. If the forecasts have been historically worse than simply using a coin to predict the weather, then what value do they have?

Figure 1. The new verification highlights incorporate a more sophisticated measure of forecast performance than the highlights featured in the past. The new highlights include color maps like this one that help readers visualize the historical accuracy of the forecasts.

To help address this question for readers, the Southwest Climate Outlook verification pages will present the average RPSS calculated for all the temperature and precipitation forecasts issued since 1994 for four different lead times. The RPSS is calculated by the Forecast Evaluation Tool, which was developed by The University of Arizona in partnership with NOAA, NASA, the National Science Foundation, and the University of California-Irvine.

In essence, the RPSS communicates how much more or less accurate the CPC forecasts have been than the reference forecast. The reference forecast for the CPC forecasts assigns equal probabilities, about a 33 percent chance each, to temperature or precipitation falling in one of three categories: “above,” “below,” or “neutral.” The CPC forecasts give probabilities, for example, that temperature will be similar to the 10 warmest, coolest, or normal temperatures observed during the period 1971–2000. The equal-probability reference is often referred to as a climatology forecast.

The actual formula of the RPSS is complicated and is beyond the scope of this article. The two important characteristics of the RPSS, however, are easily articulated. First, the higher the RPSS value, the better the forecast; the RPSS value is the percent improvement the forecast exhibits over the reference forecast. Positive values also give an indication that the forecasts and the actual weather conditions are similar—the higher the RPSS, the more similar the forecast and the actual conditions. Negative values, on the other hand, mean that the forecast is less accurate than the climatology forecast.

Second, the value of the RPSS incorporates the degree of correctness or incorrectness. This “ranked” scoring system values correct forecasts and incorrect forecasts differently—some inaccurate forecasts are worse than others. For example, if a forecast indicated a 90 percent chance for “above” temperatures but temperatures were actually “below,” the RPSS would be lower than if the forecast stated a 40 percent chance for “above” temperatures.
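For readers who want to see the ranking at work, the following is a minimal sketch of the standard textbook form of the score, not the Forecast Evaluation Tool's actual code. The names `rps`, `rpss`, `CATEGORIES`, and `CLIMATOLOGY` are illustrative, and the way each example forecast splits its remaining probability between the other two categories is an assumption made only for this example.

```python
from itertools import accumulate

# Categories must be listed in order; the "ranked" part of the score depends on it.
CATEGORIES = ("below", "neutral", "above")
CLIMATOLOGY = (1 / 3, 1 / 3, 1 / 3)  # equal-chances reference forecast


def rps(forecast, observed):
    """Ranked probability score for one forecast: the sum of squared
    differences between cumulative forecast and cumulative observed
    probabilities. Lower is better; 0 is a perfect forecast."""
    outcome = [1.0 if c == observed else 0.0 for c in CATEGORIES]
    cum_forecast = accumulate(forecast)
    cum_outcome = accumulate(outcome)
    return sum((f - o) ** 2 for f, o in zip(cum_forecast, cum_outcome))


def rpss(pairs, reference=CLIMATOLOGY):
    """Skill score over a set of (forecast, observed) pairs:
    1 - (total forecast RPS) / (total reference RPS).
    Positive beats the reference, negative is worse than it."""
    pairs = list(pairs)
    forecast_total = sum(rps(f, o) for f, o in pairs)
    reference_total = sum(rps(reference, o) for _, o in pairs)
    return 1.0 - forecast_total / reference_total


# Assumed splits of the leftover probability (illustrative only).
confident = (0.05, 0.05, 0.90)  # 90 percent chance of "above"
cautious = (0.30, 0.30, 0.40)   # 40 percent chance of "above"

# Both forecasts miss when the outcome is "below",
# but the confident one is penalized far more.
print(rpss([(confident, "below")]))  # roughly -2.08
print(rpss([(cautious, "below")]))   # roughly -0.17
```

Because the score compares cumulative probabilities, a forecast that leans toward "above" also loses more when the outcome is "below" than when it is "neutral", which is what makes the score ranked.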

The usefulness of forecast verifications such as the RPSS becomes apparent in the example of an early forecaster. In 1884, Sergeant John Finley began forecasting tornado occurrences east of the Rocky Mountains. Shortly thereafter, he reported a 95.6–98.6 percent forecast accuracy. Other scientists, however, pointed out that the accuracy could have been 98.2 percent had he simply always forecast no tornadoes. Although Finley’s forecasts seemed accurate, they were not the best forecasts. Had an RPSS been calculated, it would have been negative.
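The arithmetic behind the Finley comparison can be sketched as follows. The four counts are the figures commonly quoted in forecast verification textbooks for Finley's 1884 forecasts; they do not appear in this article and should be treated as an assumption. The score computed here is a generic accuracy-based skill score against the "always no tornado" reference, not the RPSS itself, and the variable names are illustrative.

```python
# Commonly cited counts for Finley's 1884 tornado forecasts (assumed here).
hits = 28                 # tornado forecast, tornado occurred
false_alarms = 72         # tornado forecast, none occurred
misses = 23               # no tornado forecast, tornado occurred
correct_negatives = 2680  # no tornado forecast, none occurred

total = hits + false_alarms + misses + correct_negatives

# Fraction of forecasts Finley got right.
finley_accuracy = (hits + correct_negatives) / total

# A forecaster who always says "no tornado" is right
# every time a tornado does not occur.
always_no_accuracy = (false_alarms + correct_negatives) / total

# Generic skill score: improvement over the reference,
# as a fraction of the improvement a perfect forecast would achieve.
skill = (finley_accuracy - always_no_accuracy) / (1.0 - always_no_accuracy)

print(f"Finley accuracy:    {finley_accuracy:.1%}")
print(f"Always-no accuracy: {always_no_accuracy:.1%}")
print(f"Skill vs. always-no: {skill:.2f}")
```

With these counts, the always-no reference comes out at about 98.2 percent, matching the figure quoted above, and the skill score is negative even though Finley's own accuracy is high.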

Forecasts will continue to be made, and each additional year helps make the RPSS more robust. In the meantime, knowing the accuracy of past forecasts will help readers evaluate the usefulness of the current forecast.

For questions or comments, please contact Zack Guido, CLIMAS Associate Staff Scientist, at zguido@email.arizona.edu or (520) 882-0879.