Evaluations Definition
======================

**floatCSEP** evaluates forecasts using the testing procedures from **pyCSEP** (see `Testing Theory <https://docs.cseptesting.org/getting_started/theory.html>`_). Depending on the forecast type (e.g., **GriddedForecasts** or **CatalogForecasts**), different evaluation functions can be used.

Each evaluation specifies a ``func`` parameter, representing the evaluation function to be applied, and a ``plot_func`` parameter for visualizing the results.

Evaluations for **GriddedForecasts** typically use functions from :mod:`csep.core.poisson_evaluations` or :mod:`csep.core.binomial_evaluations`, while evaluations for **CatalogForecasts** use functions from :mod:`csep.core.catalog_evaluations`.

The structure of the evaluation configuration file is similar to the model configuration, with multiple tests, each pointing to a specific evaluation function and plotting method.

**Example Configuration**:

.. code-block:: yaml

   - N-test:
       func: poisson_evaluations.number_test
       plot_func: plot_poisson_consistency_test
   - S-test:
       func: poisson_evaluations.spatial_test
       plot_func: plot_poisson_consistency_test
       plot_kwargs:
         one_sided_lower: True
   - T-test:
       func: poisson_evaluations.paired_t_test
       ref_model: Model A
       plot_func: plot_comparison_test

Evaluation Parameters:
----------------------

.. list-table::
   :widths: 20 80
   :header-rows: 1

   * - **Parameter**
     - **Description**
   * - **func** (required)
     - The evaluation function, specifying which test to run. Must be an available function from the pyCSEP evaluation suite (e.g., `poisson_evaluations.number_test`).
   * - **plot_func** (required)
     - The function used to plot the evaluation results, chosen from the available plotting functions (e.g., `plot_poisson_consistency_test`).
   * - **plot_args**
     - Arguments passed to the plotting function to customize plot titles, labels, or font sizes.
   * - **plot_kwargs**
     - Keyword arguments passed to the plotting function for fine-tuning the plot appearance (e.g., `one_sided_lower: True`).
   * - **ref_model**
     - A reference model against which the current model is compared in comparative tests (e.g., `Model A`).
   * - **markdown**
     - A description of the test, used as a caption when reporting results.

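For illustration, the optional parameters above could be combined within a single test entry. The following is a minimal sketch rather than an example from the package: the title text and the ``markdown`` caption are placeholders, and the exact keys accepted through ``plot_args`` depend on the chosen plotting function.

.. code-block:: yaml

   - M-test:
       func: poisson_evaluations.magnitude_test
       plot_func: plot_poisson_consistency_test
       plot_args:
         title: Magnitude test            # placeholder title text
       plot_kwargs:
         one_sided_lower: True            # one-sided test; only the lower tail is critical
       markdown: Magnitude consistency test of the participating models.   # placeholder report caption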

Evaluation Functions:
---------------------

Depending on the type of forecast being evaluated, different evaluation functions are used:

1. **GriddedForecasts**:

.. list-table::
   :widths: 20 80
   :header-rows: 1

   * - **Function**
     - **Description**
   * - **poisson_evaluations.number_test**
     - Evaluates the forecast by comparing the total number of forecasted events with the number of observed events using a Poisson distribution.
   * - **poisson_evaluations.spatial_test**
     - Compares the spatial distribution of forecasted events to that of the observed events.
   * - **poisson_evaluations.magnitude_test**
     - Evaluates the forecast by comparing the magnitude distribution of forecasted events with that of the observed events.
   * - **poisson_evaluations.conditional_likelihood_test**
     - Tests the likelihood of the observed events given the forecasted rates, conditioned on the total number of observed earthquakes.
   * - **poisson_evaluations.paired_t_test**
     - Calculates the information gain of a forecast relative to a reference forecast (``ref_model``) and tests whether the difference is significant using a paired t-test.
   * - **binomial_evaluations.binary_spatial_test**
     - Binary spatial test comparing the forecasted and observed spatial distributions of events.
   * - **binomial_evaluations.binary_likelihood_test**
     - Tests the likelihood of the observed events given the forecasted rates, assuming a binary distribution.
   * - **binomial_evaluations.negative_binomial_number_test**
     - Evaluates the number of events using a negative binomial distribution, comparing observed and forecasted event counts.
   * - **brier_score**
     - Uses a quadratic metric rather than a logarithmic one; does not penalize false negatives as heavily as log-likelihood metrics.
   * - **vector_poisson_t_w_test**
     - Carries out the paired t-test and the W-test for a single forecast compared against multiple forecasts.
   * - **sequential_likelihood**
     - Obtains the distribution of log-likelihoods over time.
   * - **sequential_information_gain**
     - Obtains the distribution of information gain over time, relative to a ``ref_model``.

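As a sketch of how some of the functions listed above could appear in the test configuration: the test names below and the reuse of ``plot_poisson_consistency_test`` for the binomial tests are assumptions for illustration, not taken from the package examples.

.. code-block:: yaml

   - NB-N-test:
       func: binomial_evaluations.negative_binomial_number_test
       plot_func: plot_poisson_consistency_test     # assumed consistency-style plot
   - Binary-S-test:
       func: binomial_evaluations.binary_spatial_test
       plot_func: plot_poisson_consistency_test     # assumed consistency-style plot
       plot_kwargs:
         one_sided_lower: True
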
2. **CatalogForecasts**:

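Catalog-based evaluations follow the same configuration pattern. A minimal sketch, assuming the test names below and that pyCSEP's ``plot_number_test`` and ``plot_magnitude_test`` plotting functions are available as ``plot_func`` options:

.. code-block:: yaml

   - Catalog_N-test:
       func: catalog_evaluations.number_test
       plot_func: plot_number_test
   - Catalog_M-test:
       func: catalog_evaluations.magnitude_test
       plot_func: plot_magnitude_test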