Add Metrics and Simple Baselines #1

Closed · 11 of 14 tasks
jacobbieker opened this issue Jan 12, 2023 · 12 comments · Fixed by #2
Labels: enhancement (New feature or request)
@jacobbieker (Member) commented Jan 12, 2023

We should have standard metrics and very simple baselines for use in training our models.

Baseline Models:

  • All 0's
  • All max capacity
  • Last-value persistence
  • Last-day persistence (copy the last day as the next day's forecast)
  • PV Live intraday
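
A minimal sketch of what these baselines could look like, assuming half-hourly pandas Series of outturns; every name, signature, and default here is an illustrative assumption, not a final API:

```python
# A sketch of the proposed baselines, assuming `outturn` is a half-hourly
# pandas Series indexed by timestamp. All names and defaults are assumptions.
import pandas as pd


def baseline_zeros(index: pd.DatetimeIndex) -> pd.Series:
    """Forecast zero generation at every timestamp."""
    return pd.Series(0.0, index=index)


def baseline_max_capacity(index: pd.DatetimeIndex, capacity: float) -> pd.Series:
    """Forecast full installed capacity at every timestamp."""
    return pd.Series(capacity, index=index)


def baseline_last_value(outturn: pd.Series, horizon_steps: int) -> pd.Series:
    """Persist the last observed value `horizon_steps` steps ahead."""
    return outturn.shift(horizon_steps)


def baseline_last_day(outturn: pd.Series, steps_per_day: int = 48) -> pd.Series:
    """Copy the previous day's profile as the next day's forecast."""
    return outturn.shift(steps_per_day)
```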

Metrics:

  • NMAE
  • Error during morning, afternoon, and evening
  • Error across different time horizons
  • Error across the seasons (Winter, Spring, Summer, Fall)
  • RMSE
  • MAE
  • Option to count errors larger than 1 and 2 gigawatts
  • Option to keep night time data, or remove it (defined by sun angle)
  • % of errors above 1.65 sigma, with sigma being the standard deviation of the outturn
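
For concreteness, a minimal sketch of the core metrics above; normalising NMAE by installed capacity is an assumption (mean outturn is another common choice), as are all names:

```python
# A sketch of the core metrics, assuming `forecast` and `outturn` are aligned
# pandas Series in the same units.
import numpy as np
import pandas as pd


def mae(forecast: pd.Series, outturn: pd.Series) -> float:
    return float((forecast - outturn).abs().mean())


def rmse(forecast: pd.Series, outturn: pd.Series) -> float:
    return float(np.sqrt(((forecast - outturn) ** 2).mean()))


def nmae(forecast: pd.Series, outturn: pd.Series, capacity: float) -> float:
    # Normalising by installed capacity is an assumption, not settled here.
    return mae(forecast, outturn) / capacity


def large_error_count(forecast: pd.Series, outturn: pd.Series,
                      threshold: float) -> int:
    """Count absolute errors above a threshold (e.g. 1 or 2 GW, national only)."""
    return int(((forecast - outturn).abs() > threshold).sum())
```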
@jacobbieker added the enhancement label Jan 12, 2023
@jacobbieker self-assigned this Jan 12, 2023
@jacobbieker
Member Author

@peterdudfield @dantravers Any other metrics or simple baselines I should include?

@peterdudfield

some discussion in here - https://docs.google.com/document/d/1E9pccSVVIfn8m14fUqBCVLWKNiU1dUe_zgTcurqLWww/edit

Probably good to include

  • RMSE, not just MAE.
  • MAE, as well as NMAE
  • Count of errors greater than 1 (and 2) GW (this only makes sense for national forecast)
  • option to keep night time data, or not (this can be defined by sun angle)
There are lots of others, but making a v1 first would be good; then we can slowly add to them. Good to make it modular because of this.
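
For the night-time option in the list above, one way to implement it is via sun angle with pvlib's solar position calculation; using pvlib at all, and the 5-degree elevation cutoff, are assumptions:

```python
# Drop night-time rows by solar elevation. `df` is assumed to have a
# timezone-aware DatetimeIndex; pvlib and the 5-degree cutoff are assumptions.
import pandas as pd
import pvlib


def drop_night_time(df: pd.DataFrame, latitude: float, longitude: float,
                    min_elevation_deg: float = 5.0) -> pd.DataFrame:
    """Keep only rows where solar elevation exceeds `min_elevation_deg`."""
    solpos = pvlib.solarposition.get_solarposition(df.index, latitude, longitude)
    return df[solpos["apparent_elevation"] > min_elevation_deg]
```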

Also: I'm not sure if nowcasting_utils is the right place; I think all that code is quite out of date. I'd be tempted to make a new repo, ocf-ml-metrics.

@peterdudfield

btw: thanks for making the issue @jacobbieker

@jacobbieker
Member Author

Yeah, sounds good! Just made the repo, will move this over

@jacobbieker transferred this issue from openclimatefix/nowcasting_utils Jan 12, 2023
@peterdudfield

Good use of 'transfer' feature

@dantravers

Sounds good! Another "model" I would compare against is PV_Live intraday versus PV_Live updated. This gives an estimate of accuracy for national and GSP that we know we want to beat.

To generalise the "large errors": for national I think it is good to look at errors > 1 GW (or maybe 2 GW). For site level or GSP level, I would apply a statistical measure. I propose to count the % of errors which are greater than 1.65 sigma, where sigma is the standard deviation of the time series of outturns; 1.65 sigma equates to the 5/95% range of outturns for the site. It could be any threshold, but this is probably as good as any.
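
A minimal sketch of this 1.65 sigma measure, assuming aligned pandas Series; the function name is made up:

```python
# Percentage of absolute errors above n_sigma times the standard deviation
# of the outturn series. The 1.65 multiplier comes from this thread.
import pandas as pd


def pct_large_errors_sigma(forecast: pd.Series, outturn: pd.Series,
                           n_sigma: float = 1.65) -> float:
    threshold = n_sigma * outturn.std()
    return float(100.0 * ((forecast - outturn).abs() > threshold).mean())
```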

@jacobbieker
Member Author

Okay, sounds good, I've added those. How much intraday PV Live do we have, @peterdudfield? I don't remember when we started collecting it.

@peterdudfield

We actually have a few years' worth; James re-ran some things.

I would suggest we use PV Live intraday as another baseline model, rather than entangle each model with PV Live intraday. Does that make sense?

@peterdudfield

> Sounds good! Another "model" I would compare against is PV_Live intraday versus PV_Live updated. This gives an estimate of accuracy for national and GSP that we know we want to beat.
>
> To generalise the "large errors": for national I think it is good to look at errors > 1 GW (or maybe 2 GW). For site level or GSP level, I would apply a statistical measure. I propose to count the % of errors which are greater than 1.65 sigma, where sigma is the standard deviation of the time series of outturns; 1.65 sigma equates to the 5/95% range of outturns for the site. It could be any threshold, but this is probably as good as any.

Perhaps a straightforward threshold could be v0; then for v1 we could look at something a bit more elaborate.

@jacobbieker
Member Author

> I would suggest we use PV Live intraday as another baseline model, rather than entangle each model with PV Live intraday. Does that make sense?

Yeah, that's what I was thinking for the model. We can compute the errors and save those too if we want to, but I would do that later. Comparing to the day-after PV Live is already what the other error metrics do, so I don't think we need to include that separately. Is the intraday saved somewhere in a file?

@dantravers commented Feb 13, 2023

> • % of errors above 1.65 sigma, with sigma being the standard deviation of the outturn

Thinking about this further - I would simplify and make the large errors anything above a % of the capacity for that region / site. E.g. 10% of installed capacity.
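
The simplified, capacity-based variant could look like this (the 10% default and all names are illustrative assumptions):

```python
# Percentage of absolute errors above a fixed fraction of installed capacity.
import pandas as pd


def pct_large_errors_capacity(forecast: pd.Series, outturn: pd.Series,
                              capacity: float, fraction: float = 0.10) -> float:
    return float(100.0 * ((forecast - outturn).abs() > fraction * capacity).mean())
```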

@dantravers commented Feb 13, 2023

I would suggest the following metrics are the "headline" metrics that we use to compare models at the first pass, and then look at others for more detail:

  • nMAE
  • split by forecast horizon

For national forecasts:

  • % errors > 1GW
  • MAE (although it doesn't provide more information than nMAE, it is easily understood by humans)
  • split by forecast horizon.

For site-level forecasts:

  • % errors > 10% of installed capacity (large errors - an equivalent of the 1GW for national)

The errors by time of day, season, etc. are useful and should be used for more detailed comparisons of models. It would be good to standardise on a way to present these metrics, i.e. a particular grid format.
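
One possible grid: NMAE split by forecast horizon (rows) and season (columns) as a pandas pivot table. The long-format results frame with forecast, outturn, horizon_hours, and timestamp columns is an assumed layout, not an agreed schema:

```python
# A sketch of one standardised grid: NMAE by forecast horizon and season.
import pandas as pd

_SEASONS = {12: "Winter", 1: "Winter", 2: "Winter",
            3: "Spring", 4: "Spring", 5: "Spring",
            6: "Summer", 7: "Summer", 8: "Summer",
            9: "Fall", 10: "Fall", 11: "Fall"}


def nmae_grid(results: pd.DataFrame, capacity: float) -> pd.DataFrame:
    """One row per forecast horizon, one column per season, values are NMAE."""
    df = results.copy()
    df["abs_error"] = (df["forecast"] - df["outturn"]).abs()
    df["season"] = df["timestamp"].dt.month.map(_SEASONS)
    return df.pivot_table(values="abs_error", index="horizon_hours",
                          columns="season", aggfunc="mean") / capacity
```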
