Environmental Modeling Center United States Department of Commerce

The content provided on this page supports model development. These are not official NWS products and should not be relied upon for operational purposes. This web site is not subject to 24/7 support, and thus may be unavailable during system outages.

Please see our disclaimer for more information.

GEFS-Wave Satellite Verification



These statistics are based on the ensemble mean fields

These statistics are based on the ensemble member fields

General Information

These pages compare observations from two satellite platforms, Jason-3 and SARAL/AltiKa, with three WAVEWATCH III® implementations:

  1. GEFSv12-Waves
  2. GWES
  3. Multigrid WW3 (Multi_1)

Workflow Description

  1. convert all GRIB files to NetCDF
  2. for the GEFS and GWES ensembles, compute a daily mean field
  3. use the mean field for the plotting and statistics
  4. extract the satellite data for a 60-minute window centered on 00Z
  5. extract the model nowcasts/forecasts, synchronized to valid time
  6. interpolate the model grid to the satellite track using the 8 nearest neighbors with Gaussian weights
  7. plot the parameter along the track
  8. compute daily statistics and save them to a SQL database
  9. extract and plot 30-day time series of the statistics
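
Step 6 of the workflow can be sketched as follows. This is a minimal illustration, not the code used to produce these pages: the function name `interp_to_track`, the Gaussian length scale `sigma_km`, the haversine distance, and the brute-force neighbor search are all assumptions made for the example. A production workflow would more likely use a spatial index such as a KD-tree.

```python
import numpy as np

def interp_to_track(grid_lon, grid_lat, grid_val, track_lon, track_lat,
                    k=8, sigma_km=50.0):
    """Gaussian-weighted mean of the k nearest grid points per track point.

    sigma_km is a hypothetical length scale; the page does not state one.
    """
    # Flatten the model grid into 1-D point lists (radians for the distances).
    glon = np.radians(np.ravel(grid_lon))
    glat = np.radians(np.ravel(grid_lat))
    gval = np.ravel(grid_val).astype(float)
    tlon = np.radians(np.asarray(track_lon, dtype=float))[:, None]
    tlat = np.radians(np.asarray(track_lat, dtype=float))[:, None]

    # Haversine great-circle distance in km, shape (n_track, n_grid).
    a = (np.sin((glat - tlat) / 2) ** 2
         + np.cos(tlat) * np.cos(glat) * np.sin((glon - tlon) / 2) ** 2)
    dist = 2 * 6371.0 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

    # Indices, distances and values of the k nearest grid points.
    idx = np.argpartition(dist, k - 1, axis=1)[:, :k]
    d_k = np.take_along_axis(dist, idx, axis=1)
    v_k = gval[idx]

    # Gaussian weights; missing (NaN) grid values get zero weight.
    w = np.exp(-0.5 * (d_k / sigma_km) ** 2)
    w = np.where(np.isnan(v_k), 0.0, w)
    wsum = w.sum(axis=1)
    num = (w * np.nan_to_num(v_k)).sum(axis=1)
    return np.where(wsum > 0, num / np.where(wsum > 0, wsum, 1.0), np.nan)

# Usage with a made-up 5x5 one-degree grid holding a uniform 2 m wave height.
lon2d, lat2d = np.meshgrid(np.arange(0.0, 5.0), np.arange(0.0, 5.0))
hs_grid = np.full(lon2d.shape, 2.0)
hs_track = interp_to_track(lon2d, lat2d, hs_grid, [2.3, 1.7], [2.1, 3.4])
```

Because the weights are normalised by their sum, a uniform field is reproduced exactly along the track, which makes a convenient sanity check.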

Statistics Definitions


Bias

The mean difference between the model and the observations. It measures the tendency of the model to over- or underestimate the value of a parameter. Smaller absolute bias values indicate better agreement between measured and calculated values. Positive bias means overprediction; negative bias means underprediction.

        diff = model_data - observations
        bias = diff.mean()       

Root-Mean-Square Error (RMS Error)

Also called the root-mean-squared deviation, it's a measure of the differences between the observed and predicted values. Smaller RMSE values indicate better agreement between measured and calculated values.
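
A minimal sketch of the RMSE calculation, written in the same style as the bias snippet. The array values are made up for illustration, and the use of NumPy is an assumption; the page does not say which array library is used.

```python
import numpy as np

# Made-up along-track values (e.g. significant wave height in metres).
model_data = np.array([2.1, 1.8, 3.0, 2.4])
observations = np.array([2.0, 2.0, 2.5, 2.5])

diff = model_data - observations
# Square the differences, average them, then take the square root.
rmse = ((diff ** 2).mean()) ** 0.5
```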


Scatter Index (SI)

Defined as the standard deviation of the difference between model and observations, normalised by the mean of the observations and expressed as a percentage. Smaller values of SI indicate better agreement between the model and observations.

        scatter_index = 100.0*(((diff**2).mean() - bias**2)**0.5)/observations.mean()
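
The prose definition relies on the identity that the variance of the differences equals the mean squared difference minus the squared bias. The sketch below follows that definition and checks it against NumPy's own standard deviation. The function name `scatter_index` and the sample values are assumptions for illustration.

```python
import numpy as np

def scatter_index(model_data, observations):
    """Scatter index in percent: std(model - obs) / mean(obs) * 100."""
    model_data = np.asarray(model_data, dtype=float)
    observations = np.asarray(observations, dtype=float)
    diff = model_data - observations
    bias = diff.mean()
    mse = (diff ** 2).mean()
    # Variance of the differences = MSE - bias^2 (clamped against round-off).
    std_diff = max(mse - bias ** 2, 0.0) ** 0.5
    return 100.0 * std_diff / observations.mean()

# Made-up values for illustration.
model_data = np.array([2.1, 1.8, 3.0, 2.4])
observations = np.array([2.0, 2.0, 2.5, 2.5])
si = scatter_index(model_data, observations)
si_check = 100.0 * (model_data - observations).std() / observations.mean()
```

Removing the bias before normalising means a model with a constant offset but perfect variability has a scatter index of zero.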

Ensemble Statistics Definitions

Talagrand Diagram

Rank histograms (sometimes called verification rank histograms or Talagrand diagrams) show how reliable an ensemble forecast is when compared with a set of newly observed data. Each observation is ranked against the sorted ensemble members, and the histogram counts how often each rank occurs. If the ensemble is reliable, the rank histogram will be flat. Deviations from a uniform distribution (i.e. histogram blocks that are above or below the red line) indicate that the ensemble is biased or that its spread is too small or too large. These types of diagrams are not commonly used outside of ensemble forecasting. (2016 Statistics How To)
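
As an illustration of how the ranks behind such a diagram can be counted, here is a minimal sketch. The function name `rank_histogram` and the array layout are assumptions, and ties between an observation and a member are not treated specially.

```python
import numpy as np

def rank_histogram(ensemble, observations):
    """Count observation ranks within the ensemble.

    ensemble: array of shape (n_members, n_points)
    observations: array of shape (n_points,)
    Returns n_members + 1 counts, one per possible rank.
    """
    ens = np.asarray(ensemble, dtype=float)
    obs = np.asarray(observations, dtype=float)
    # Rank = number of members strictly below the observation (0..n_members).
    ranks = (ens < obs[None, :]).sum(axis=0)
    return np.bincount(ranks, minlength=ens.shape[0] + 1)

# Three members, three points (made-up values).
ensemble = np.array([[1.0, 1.0, 1.0],
                     [2.0, 2.0, 2.0],
                     [3.0, 3.0, 3.0]])
observations = np.array([0.5, 2.5, 3.5])
counts = rank_histogram(ensemble, observations)
```

With many points and a reliable ensemble, each of the `n_members + 1` bins should hold roughly the same count.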

Continuous Ranked Probability Score

Calculated as the integrated squared difference between the forecast cumulative distribution function (CDF) and the CDF of the observation, which is a step function at the observed value. Reduces to the Mean Absolute Error in the case of a deterministic forecast.
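
For a finite ensemble, the CRPS at a single point can be estimated without building the CDFs explicitly, using the standard identity CRPS = E|X - y| - 0.5 E|X - X'|, where X and X' are independent draws from the ensemble and y is the observation. The sketch below uses that estimator; whether these pages use the same one is not stated, and the function name `crps_ensemble` is an assumption.

```python
import numpy as np

def crps_ensemble(members, obs):
    """CRPS of one ensemble forecast against one scalar observation."""
    x = np.asarray(members, dtype=float)
    # Mean absolute error of the members against the observation.
    term_obs = np.abs(x - obs).mean()
    # Half the mean absolute difference between all member pairs.
    term_spread = 0.5 * np.abs(x[:, None] - x[None, :]).mean()
    return term_obs - term_spread

crps_two = crps_ensemble([1.0, 3.0], 2.0)   # two-member ensemble
crps_det = crps_ensemble([2.0], 3.0)        # deterministic forecast
```

With a single member the spread term vanishes, so the score is just the absolute error, which is why averaging over many cases gives the Mean Absolute Error for a deterministic forecast.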

Brier Threshold Score

Brier Score for the ensemble probability of exceeding the specified threshold. It is the mean squared error between the predicted probabilities and the observed outcomes (1 if the threshold was exceeded, 0 otherwise). The score summarizes the magnitude of the error in the probability forecasts. It falls between 0.0 and 1.0, with a perfect forecast scoring 0.0.
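
A minimal sketch of a threshold Brier score, assuming the forecast probability is taken as the fraction of ensemble members above the threshold. The function name `brier_score`, the array layout, and the sample values are illustrative assumptions.

```python
import numpy as np

def brier_score(ensemble, observations, threshold):
    """Brier score for exceeding a threshold.

    ensemble: array of shape (n_members, n_points)
    observations: array of shape (n_points,)
    """
    ens = np.asarray(ensemble, dtype=float)
    obs = np.asarray(observations, dtype=float)
    # Forecast probability = fraction of members above the threshold.
    prob = (ens > threshold).mean(axis=0)
    # Observed outcome: 1.0 where the threshold was exceeded, else 0.0.
    outcome = (obs > threshold).astype(float)
    return ((prob - outcome) ** 2).mean()

# Four members, two points (made-up values), threshold of 4 m.
ensemble = np.array([[1.0, 5.0],
                     [3.0, 5.0],
                     [5.0, 5.0],
                     [5.0, 5.0]])
observations = np.array([4.5, 3.0])
bs = brier_score(ensemble, observations, threshold=4.0)
```

Here the first point has a forecast probability of 0.5 with the event observed, and the second has a probability of 1.0 with no event, giving (0.25 + 1.0) / 2.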