Event detection
An event is a discrete period on the timeline that captures a notable feature within a time-series, such as a "flood peak". Event detection is concerned with automatically identifying the start and end datetimes of one or more of these features within a time-series and then filtering the verification pairs for evaluation. For example, if a time-series contains several flood peaks, a user may be interested in the average bias in the predictions from a model across these peaks or the individual timing errors in the peak values when compared to observed peaks.
More philosophically, event detection is an inherently subjective or parameterized analysis of a time-series. In other words, the precise number of events and the precise start and end datetimes of the individual events is rarely obvious. Event detection is merely an attempt to automate the selection of interesting periods for further analysis and evaluation and, in practice, it can require a significant amount of experimentation and iteration to identify what a user might consider to be reasonable event boundaries or interesting periods.
In principle, event detection may be applied to arbitrary time-series variables, such as precipitation, temperature or streamflow. In practice, however, the WRES currently offers only a single method for event detection, which was developed for river stage or streamflow time-series and may not work effectively for other variables.
This method is described in the following reference:
Regina, J.A., and Ogden, F.L. (2021). Automated correction of systematic errors in high-frequency depth measurements above V-notch weirs using time series decomposition. Hydrological Processes, 35(12), doi: https://doi.org/10.1002/hyp.14405.
The technique was developed for time-series of streamflow or river stage and decomposes the time-series "signal" into several additive contributions, namely:
1. A trend or low-frequency periodic component (e.g., in the context of streamflow, a baseflow or seasonal-scale variation);
2. A high-frequency periodic component (e.g., daily-scale variation, such as evapotranspiration);
3. A high-frequency non-periodic component (e.g., precipitation-driven runoff); and
4. A high-frequency non-structural or noise component (e.g., instrument error or white noise with zero autocorrelation).
The technique aims to model each of these components separately in order to remove the contributions from (1), (2) and (4), leaving (3), the high-frequency non-periodic component, which is composed of "events".
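Expressed as a simple additive model (the notation below is illustrative rather than taken from the reference above), the decomposition is:

$$y(t) = T(t) + P(t) + E(t) + \varepsilon(t)$$

where $y(t)$ is the time-series value at time $t$, $T(t)$ is the trend or low-frequency component (1), $P(t)$ is the high-frequency periodic component (2), $E(t)$ is the high-frequency non-periodic or "event" component (3) and $\varepsilon(t)$ is the noise component (4).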
The technique affords several user-defined parameters, which are all declared in duration units (e.g., hours or days):
- A "window size", which represents the size of the window used to detect and remove the trend component of the time-series. Larger durations will detect more slowly varying trends.
- A "half-life", which is used to perform exponential weighted averaging of the time-series to eliminate noise and reduce the detection of normal instrument fluctuations as events. More specifically, it is the time lag at which the exponential weights decay by one half. Longer durations lead to greater smoothing.
- A "minimum event duration", which is the shortest event period that will be admitted for evaluation. This may be used to eliminate shorter or "noisier" events from consideration.
- A "start radius", which is the period relative to the event start datetime to look for a local minimum or "better" start datetime for an event, following smoothing (i.e., a datetime whose variable value is smaller than the initially selected event start datetime). A longer "start radius" provides greater scope to impart a larger phase shift on the initially selected start datetimes (i.e., to advance or delay the events).
While the WRES will attempt to choose reasonable default values for each of these parameters, it is anticipated that some trial and error will be required by the user when the default events are unsatisfactory. In practice, the number and timing of the detected events can be sensitive to the choice of parameter values, which is both useful (i.e., affords flexibility) and a source of subjectivity.
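To build intuition for how the "half-life" and "window size" behave, the following is a minimal sketch of exponential smoothing and rolling-window detrending using pandas. It is a simplified illustration only, not the WRES implementation, and the function name and threshold are assumptions:

```python
import pandas as pd

def sketch_candidate_events(series: pd.Series, window_size: str, half_life: str,
                            threshold: float = 0.0) -> pd.Series:
    """Illustrative sketch only: smooth, detrend and flag candidate event periods.

    series      : values with a DatetimeIndex (e.g., streamflow)
    window_size : offset string for the rolling trend window, e.g. "4800h"
    half_life   : offset string for the exponential smoothing, e.g. "6h"
    """
    # Exponential smoothing: weights decay by one half per half_life, suppressing
    # noise and normal instrument fluctuations (component 4)
    smoothed = series.ewm(halflife=pd.Timedelta(half_life), times=series.index).mean()

    # Rolling trend: larger windows capture more slowly varying trends (component 1)
    trend = smoothed.rolling(window=window_size, min_periods=1).mean()

    # The residual approximates the high-frequency, non-periodic signal (component 3);
    # values above an assumed threshold mark candidate event periods, which could then
    # be filtered by a minimum event duration
    residual = smoothed - trend
    return residual > threshold
```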
The range and default values of the parameters are further described in What are the possible values of the event detection parameters?
Yes, event detection can be applied to an arbitrary number of time-series across any number of data sources (i.e., `observed`, `predicted`, `baseline` or `covariates`). Additionally, there are strategies for:
- Combining events across different data sources, including:
  - The union of events, i.e., the superset that contains all of the events detected for the individual data sources; and
  - The intersection of events, i.e., the events that overlap on the timeline or intersect across all of the various data sources, simultaneously.
- Aggregating events that intersect, using various aggregation policies, notably:
  - The maximum duration spanned by all of the intersecting events (i.e., the outermost datetimes of all the intersecting events);
  - The minimum duration spanned by all of the intersecting events (i.e., only those datetimes where all events have intersecting values); and
  - The average duration spanned by all of the intersecting events (i.e., the average of the start datetimes and, separately, the end datetimes of the intersecting events).
For example, to conduct an evaluation for those periods where a "flood peak" was detected in both an `observed` and `predicted` time-series, and to evaluate the paired values where either the `observed` or `predicted` time-series registered a "flood event", an "intersection" operation may be appropriate, together with a "maximum" duration for the aggregation policy. Alternatively, to conduct an evaluation for those periods where a "flood peak" was detected in either the `observed` or `predicted` time-series, a "union" operation would be appropriate.
Other set operations may be considered in future, such as the "symmetric difference" (for periods that contain an event in only one dataset) and "complement" (for periods that do not contain an event in either dataset).
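To make the combination and aggregation behavior concrete, here is a minimal sketch of these operations over event periods represented as (start, end) pairs. The function names are illustrative and not part of the WRES:

```python
from datetime import datetime
from typing import List, Tuple

Event = Tuple[datetime, datetime]  # (start datetime, end datetime)

def intersects(a: Event, b: Event) -> bool:
    """True when two event periods overlap on the timeline."""
    return a[0] <= b[1] and b[0] <= a[1]

def aggregate(events: List[Event], policy: str) -> Event:
    """Aggregate a group of intersecting events into a single overall period."""
    starts, ends = [e[0] for e in events], [e[1] for e in events]
    if policy == "maximum":  # outermost datetimes of all intersecting events
        return min(starts), max(ends)
    if policy == "minimum":  # only the datetimes common to all intersecting events
        return max(starts), min(ends)
    if policy == "average":  # average the start datetimes and, separately, the end datetimes
        mean = lambda ts: datetime.fromtimestamp(sum(t.timestamp() for t in ts) / len(ts))
        return mean(starts), mean(ends)
    raise ValueError(f"Unknown aggregation policy: {policy}")

# Example: one observed event and one predicted event that overlap
observed_events = [(datetime(2023, 2, 12), datetime(2023, 3, 12))]
predicted_events = [(datetime(2023, 2, 20), datetime(2023, 3, 20))]

# "union": keep every event from every data source
union = observed_events + predicted_events

# "intersection" with "maximum" aggregation: keep only overlapping events and
# span each overlapping group with its outermost datetimes
intersection = [aggregate([o, p], "maximum")
                for o in observed_events
                for p in predicted_events
                if intersects(o, p)]
print(intersection)  # one event spanning 2023-02-12 to 2023-03-20
```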
In the simplest case, event detection may be declared as follows:
```yaml
observed: observed.csv
predicted: predicted.csv
event_detection: observed
```
In this example, event detection will be conducted with the `observed` data source using some default parameter values. All other options (e.g., metrics to calculate, formats to write) will be chosen by the software.
In other cases, an evaluation may request event detection for multiple data sources and may prefer to define some or all of the available parameter values. For example:
```yaml
observed: observed.csv
predicted: predicted.csv
event_detection:
  dataset:
    - observed
    - predicted
  parameters:
    window_size: 4800
    start_radius: 380
    half_life: 6
    minimum_event_duration: 120
    duration_unit: hours
    combination:
      operation: intersection
      aggregation: maximum
```
In this example, event detection will be performed for both the `observed` and `predicted` time-series using the prescribed `window_size`, `start_radius`, `half_life` and `minimum_event_duration`. Finally, the events detected across the two data sources will be combined and aggregated by forming their temporal intersection and by considering the maximum period spanned by each pair of intersecting events (i.e., because there are two data sources).
Yes, covariate datasets are supported and event detection may be performed with them. For example:
```yaml
observed: observed.csv
predicted: predicted.csv
covariates: covariate_observations.csv
event_detection: covariates
```
Here, the `observed` and `predicted` datasets will be evaluated for the event periods detected using the `covariates` dataset. If multiple covariates are declared, event detection may be constrained to a specific subset of covariates by declaring the `purpose` of each covariate as either or both of `detect` (will be used for event detection) and `filter` (will be used for filtering). For example:
```yaml
observed: observed.csv
predicted: predicted.csv
covariates:
  - sources: precipitation.csv
    purpose: detect
  - sources: temperature.csv
    maximum: 0
event_detection: covariates
```
In this case, only precipitation.csv will be used for event detection. When `event_detection` is declared and the `purpose` of a covariate is undeclared, the context of that covariate is inspected to determine whether it should be used for event detection or filtering. Specifically, if there are no filtering parameters (i.e., no `minimum` or `maximum`), then the `purpose` is assumed to be `detect`, otherwise `filter`. Thus, while the `purpose` helps to clarify the role of the precipitation covariate in the above example, it is not strictly needed.
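The rule for inferring an undeclared `purpose` can be summarized with a short, hypothetical sketch (the function below is for illustration only and is not part of the WRES):

```python
def infer_purpose(covariate: dict) -> set:
    """Illustrative only: infer the role of a covariate when event_detection is declared.

    A declared purpose is respected; otherwise a covariate with filtering parameters
    (minimum or maximum) defaults to "filter", and one without defaults to "detect".
    """
    declared = covariate.get("purpose")
    if declared is not None:
        return set(declared) if isinstance(declared, list) else {declared}
    has_filters = "minimum" in covariate or "maximum" in covariate
    return {"filter"} if has_filters else {"detect"}

# From the example above: precipitation.csv declares a purpose; temperature.csv does not
print(infer_purpose({"sources": "precipitation.csv", "purpose": "detect"}))  # {'detect'}
print(infer_purpose({"sources": "temperature.csv", "maximum": 0}))           # {'filter'}
```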
The event detection parameters and their possible values and defaults are tabulated below.
| Parameter | Declared name | Purpose | Possible values | Default |
| --- | --- | --- | --- | --- |
| Method | `method` | The event detection method name. | `regina ogden` or, equivalently, `default`. | `regina ogden` |
| Window size | `window_size` | The size of the window used to detect and remove the trend component. | Any value greater than zero duration units. | Ten times the `half_life` when defined, else two hundred times the average timestep, which is detected from the time-series data. |
| Half life | `half_life` | The time lag at which the weights in the exponential smoothing decay by one half. | Any value greater than zero duration units. | One tenth of the `window_size` when defined, else twenty times the average timestep, which is detected from the time-series data. |
| Minimum event duration | `minimum_event_duration` | The minimum duration of any event accepted for evaluation. | Any value greater than or equal to zero duration units. | Zero. |
| Start radius | `start_radius` | The period to search for a refined start datetime and effectively advance or delay the start of the event. | Any value greater than or equal to zero duration units. | Zero. |
| Duration unit | `duration_unit` | The units of the duration parameters (one unit for all parameters). | `seconds`, `minutes`, `hours` and `days`. | None, i.e., it is required when duration parameters are declared. |
| Operation | `operation`, only when `combination` is declared | The set operator to use when combining events across multiple time-series datasets. | `union` or `intersection`. | None when `combination` is declared, otherwise `union`. |
| Aggregation | `aggregation`, only when `combination` is declared and the `operation` is `intersection` | The aggregation method to use when identifying the overall event period across intersecting event periods. | `minimum`, `maximum` or `average`. | None. |
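The relationships between the tabulated defaults can be expressed as a short sketch (illustrative only; the WRES resolves these defaults internally):

```python
from datetime import timedelta
from typing import Optional, Tuple

def resolve_defaults(window_size: Optional[timedelta],
                     half_life: Optional[timedelta],
                     average_timestep: timedelta) -> Tuple[timedelta, timedelta]:
    """Illustrative only: resolve default window_size and half_life values,
    following the defaults in the table above."""
    if window_size is None:
        window_size = 10 * half_life if half_life is not None else 200 * average_timestep
    if half_life is None:
        half_life = window_size / 10
    return window_size, half_life

# With neither parameter declared and a 144-minute average timestep (as in the example
# below), the defaults are 28,800 minutes (20 days) and 2,880 minutes (2 days)
print(resolve_defaults(None, None, timedelta(minutes=144)))
```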
Consider the following pair of synthetic time-series, one that nominally represents an "observed" series and one that represents a "predicted" series. The two time-series are sine waves that span one calendar year, from 2023-01-01T00:00:00Z until 2023-12-31T23:59:59Z, with an arbitrary amplitude and frequency, leading to eight peaks and seven troughs, or six complete cycles with two partially complete cycles. The sine waves are sampled every 144 minutes (1/10th of one day), leading to 3,650 sampled values.
In this case, the "observed" time-series is phase-shifted by 192 hours to generate the "predicted" time-series, so the timing of each predicted peak is 192 hours later than its corresponding observed peak.
Consider the following declaration, which references these two time-series datasets for event detection:
```yaml
observed: observations.csv
predicted: predictions.csv
event_detection:
  dataset:
    - observed
    - predicted
  parameters:
    combination:
      operation: intersection
      aggregation: maximum
metrics:
  - name: time to peak error
    summary_statistics:
      - mean
      - standard deviation
output_formats:
  - png
```
In this case, the events detected across the two data sources (`observed` and `predicted`) are combined by forming their intersection and aggregated by finding the maximum period spanned by the pairs of intersecting events. Otherwise, default parameter values are used for event detection. The timing errors are then calculated for each pair of detected (intersected) events. Summary statistics are also requested, specifically the mean and standard deviation of the timing errors across all cases.
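For a single detected event period, the time to peak error is conceptually simple. A minimal sketch follows (illustrative only; the WRES computes this internally):

```python
import pandas as pd

def time_to_peak_error(observed: pd.Series, predicted: pd.Series,
                       start: pd.Timestamp, end: pd.Timestamp) -> pd.Timedelta:
    """Illustrative only: the predicted peak time minus the observed peak time within
    one detected event period (positive means the predicted peak occurs too late)."""
    observed_peak_time = observed.loc[start:end].idxmax()
    predicted_peak_time = predicted.loc[start:end].idxmax()
    return predicted_peak_time - observed_peak_time
```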
This results in the following detected events (shaded):
Note that only one of the partially complete cycles contained sufficient data for event detection, at least with the default parameter values, which leads to seven detected events.
The timing errors are written to a PNG image, as declared (an arbitrary location identifier and variable name were assigned to each time-series in the CSV source datasets):
There are seven timing errors, each corresponding to one pair of events. Each timing error is recorded against the start datetime of its corresponding event ("earliest valid time").
The summary statistics are shown below:
As indicated in both plots, there is a constant timing error of 192 hours, in keeping with the design of this example.
To reproduce this example, you may use the observations and predictions below, together with the above declaration:
- Observations: observations.csv
- Predictions: predictions.csv
The events are recorded in all of the output formats, including the numerical formats, where the precise event start and end datetimes are shown.
For example, the CSV2 format contains a column, `EARLIEST VALID TIME EXCLUSIVE`, which records the start datetime of each event, and a column, `LATEST VALID TIME INCLUSIVE`, which records the end datetime of each event. The datetimes are formatted using the ISO 8601 standard. Thus, a detected "event" is described in exactly the same way as a manually declared event period that is spanned by two, user-defined, valid datetimes (also known as a "time window").
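For example, using pandas to inspect the event periods from a CSV2 output file (the file name below is a placeholder; the actual name and location depend on how the evaluation was run):

```python
import pandas as pd

# Placeholder file name for the CSV2 statistics output
statistics = pd.read_csv("evaluation.csv.gz")

# The start and end datetimes of each detected event, one pair per time window
events = statistics[["EARLIEST VALID TIME EXCLUSIVE",
                     "LATEST VALID TIME INCLUSIVE"]].drop_duplicates()
print(events)
```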
In addition, the time windows used in an evaluation, whether declared or detected, are recorded in the application or service log, which is primarily intended for developers, but is also available for users to examine. For example:
2025-01-17T17:50:11.599+0000 29212 [main] INFO EvaluationUtilities - Created 7 pool requests, which include 1 feature groups and 7 time windows. The feature groups are: [ FAKE1-FAKE1 ]. The time windows are: [ TimeWindowOuter[earliestReferenceTime=-1000000000-01-01T00:00:00Z,latestReferenceTime=+1000000000-12-31T23:59:59.999999999Z,earliestValidTime=2023-02-12T16:48:00Z,latestValidTime=2023-03-12T12:00:00Z,earliestLeadDuration=PT-2562047788015215H-30M-8S,latestLeadDuration=PT2562047788015215H30M7.999999999S], TimeWindowOuter[earliestReferenceTime=-1000000000-01-01T00:00:00Z,latestReferenceTime=+1000000000-12-31T23:59:59.999999999Z,earliestValidTime=2023-04-03T16:48:00Z,latestValidTime=2023-05-01T12:00:00Z,earliestLeadDuration=PT-2562047788015215H-30M-8S,latestLeadDuration=PT2562047788015215H30M7.999999999S], TimeWindowOuter[earliestReferenceTime=-1000000000-01-01T00:00:00Z,latestReferenceTime=+1000000000-12-31T23:59:59.999999999Z,earliestValidTime=2023-05-23T16:48:00Z,latestValidTime=2023-06-20T12:00:00Z,earliestLeadDuration=PT-2562047788015215H-30M-8S,latestLeadDuration=PT2562047788015215H30M7.999999999S], TimeWindowOuter[earliestReferenceTime=-1000000000-01-01T00:00:00Z,latestReferenceTime=+1000000000-12-31T23:59:59.999999999Z,earliestValidTime=2023-07-12T16:48:00Z,latestValidTime=2023-08-09T12:00:00Z,earliestLeadDuration=PT-2562047788015215H-30M-8S,latestLeadDuration=PT2562047788015215H30M7.999999999S], TimeWindowOuter[earliestReferenceTime=-1000000000-01-01T00:00:00Z,latestReferenceTime=+1000000000-12-31T23:59:59.999999999Z,earliestValidTime=2023-08-31T16:48:00Z,latestValidTime=2023-09-28T12:00:00Z,earliestLeadDuration=PT-2562047788015215H-30M-8S,latestLeadDuration=PT2562047788015215H30M7.999999999S], TimeWindowOuter[earliestReferenceTime=-1000000000-01-01T00:00:00Z,latestReferenceTime=+1000000000-12-31T23:59:59.999999999Z,earliestValidTime=2023-10-20T16:48:00Z,latestValidTime=2023-11-17T12:00:00Z,earliestLeadDuration=PT-2562047788015215H-30M-8S,latestLeadDuration=PT2562047788015215H30M7.999999999S], TimeWindowOuter[earliestReferenceTime=-1000000000-01-01T00:00:00Z,latestReferenceTime=+1000000000-12-31T23:59:59.999999999Z,earliestValidTime=2023-12-09T16:48:00Z,latestValidTime=2023-12-31T19:12:00Z,earliestLeadDuration=PT-2562047788015215H-30M-8S,latestLeadDuration=PT2562047788015215H30M7.999999999S] ].
Yes, there are several, both in general and with the WRES, specifically:
- By definition, the existence and characteristics of an event and whether it should qualify for evaluation is a subjective analysis, as with many aspects of a statistical evaluation. There is no objective analysis of what qualifies as an event or when an event begins or ends, except in the superficial sense that consistent rules may be defined. The goal with event detection in the WRES is rather modest (but also challenging), namely to automate the selection of events in a way that reasonably agrees with the events a user might select by visually inspecting the time-series.
- There is currently only one event detection algorithm available. The algorithm is designed for river stage or streamflow, primarily in fast-responding basins whose time-series are characterized by distinct, runoff-driven, "peaks" (and, therefore, amenable to separation or decomposition). It may not work well in slow responding basins or for detecting indistinct or "low frequency" peaks. It is unlikely to work well for other variables, particularly discontinuous or mixed variables, such as precipitation.
- Traditional evaluation techniques should be used with caution for event periods. For example, events will typically span a short period with a correspondingly small sample size, which may be inappropriate for traditional statistical evaluation. Instead, "event-based" measures or "signatures", such as the `time to peak error`, should be favored (see the example, Do you have a simple example of event detection, in practice?). Only a small number of event-based or signature metrics are currently supported, but more will be added over time. These measures are only available for single-valued time-series. See the List of Metrics Available.
- When evaluating events that are common to both of the datasets being compared (e.g., `observed` and `predicted`) - that is, an `intersection` - these statistics will, by definition, only represent the situations in which the `predicted` dataset correctly identified an event, at least for some of the intersecting period for which the `observed` dataset also identified an event. This is known as a "true positive". The statistics will not consider "false positives" or "false negatives", which occur when the `predicted` dataset incorrectly identified an event or incorrectly missed an event, respectively. Thus, it may be prudent to consider measures of false positives and false negatives alongside statistics for intersecting events. Remember, a flood prediction system that always predicts flooding is perfectly accurate when flooding occurs.
- When calculating statistics for event periods that are not necessarily common to both of the datasets being compared (e.g., `observed` and `predicted`), such as a `union`, care should be taken in calculating and interpreting any event-based measures, such as the `time to peak error`. In this situation, when an event was detected for only one of the two datasets, the dataset without the event will have an arbitrary peak whose timing error with respect to the peak associated with the other dataset (i.e., the dataset that contained an event) may not be very meaningful.
- Event detection is not currently supported for forecast datasets. In practice, statistical evaluations of forecasts may be performed with events detected using non-forecast datasets and, indeed, this may be the majority application for event detection with forecasts. Nevertheless, short- to medium-range forecasts will generally only span part of an event period, even in fast responding basins, so care is needed when interpreting event-based statistics for forecast datasets.
- Currently, event detection is part of an evaluation workflow. However, it would be more flexible to decouple these two activities, as event detection will often require iteration before the evaluation statistics are useful. This would involve a separate web service for event detection, which would be leveraged by the WRES.