Feature 1259 es_prob_stats #2067

JohnHalleyGotway · 2022-02-24T15:46:12Z

Expected Differences

This PR is now ready for review.

Do these changes introduce new tools, command line arguments, or configuration file options? [Yes]

If yes, please describe:

Listed below are the additions to the Ensemble-Stat config file:

   prob_pct_thresh = [ ==0.25 ]; // to define Nx2 PCT table (totally new entry)
   eclv_points = 0.05; // to define ECLV points (already exists in other tools)

And adds entries to output_flag for newly supported prob line types (already exists in other tools):

   pct   = NONE;
   pstd  = NONE;
   pjc   = NONE;
   prc   = NONE;
   eclv  = NONE;
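
For illustration only, here is a minimal sketch of how a setting like prob_pct_thresh = [ ==0.25 ] could translate into the rows of an Nx2 PCT table. It assumes the ==0.25 shorthand expands to probability bins of width 0.25 spanning 0 to 1, and that each row counts event and non-event observations for probabilities in [lower, upper). The names expand_prob_thresh, PctRow, and build_pct_table are hypothetical and are not taken from the MET source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not MET code): expand an "==width" shorthand into
// probability bin edges spanning 0 to 1, e.g. 0.25 -> {0, 0.25, 0.5, 0.75, 1}.
std::vector<double> expand_prob_thresh(double width) {
   assert(width > 0.0 && width <= 1.0);
   const int n = static_cast<int>(std::lround(1.0 / width));
   std::vector<double> edges;
   for (int i = 0; i <= n; ++i) edges.push_back(i * width);
   edges.back() = 1.0;  // pin the last edge to exactly 1
   return edges;
}

// One row of the Nx2 table: event and non-event observation counts.
struct PctRow {
   int n_event = 0;
   int n_nonevent = 0;
};

// Hypothetical helper: bin (probability, observed event) pairs into N rows.
std::vector<PctRow> build_pct_table(const std::vector<double> &edges,
                                    const std::vector<double> &prob,
                                    const std::vector<bool> &event) {
   assert(edges.size() >= 2 && prob.size() == event.size());
   const std::size_t n_row = edges.size() - 1;
   std::vector<PctRow> table(n_row);
   for (std::size_t i = 0; i < prob.size(); ++i) {
      // Assumed convention: row j holds probabilities in [edges[j], edges[j+1]);
      // a probability of exactly 1 falls into the last row.
      std::size_t r = n_row - 1;
      for (std::size_t j = 0; j < n_row; ++j) {
         if (prob[i] < edges[j + 1]) { r = j; break; }
      }
      if (event[i]) ++table[r].n_event;
      else          ++table[r].n_nonevent;
   }
   return table;
}
```

With a width of 0.25 this yields four rows, so N = 4 in the Nx2 table.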

Do these changes modify the structure of existing or add new output data types (e.g. statistic line types or NetCDF variables)? [No]

If yes, please describe:

Pull Request Testing

Describe testing already performed for these changes:

Tested manually with 3 specific uses in mind:

When prob_cat_thresh lists explicit thresholds, they are applied and the corresponding prob stats are reported.
When prob_cat_thresh is defined, along with climo bins, each prob_cat_thresh is used to derive an ensemble relative frequency (shown as PROB(NAME>THRESH) in the output). Those pairs (ensemble probabilities / observations) are SUBSET into climo bins. Output is generated for each bin and the summary BIN_MEAN is reported as the average across those climo bins. In this case, the BIN_MEAN TOTAL column contains the sum of the bin totals.
When prob_cat_thresh is empty, climo bins are requested, and climo data is supplied, the probability thresholds are defined relative to the climo distribution. Instead of using climo bins to SUBSET, we use them to define the probability thresholds. The probability forecasts are evaluated for each bin using ALL available obs, and the prob statistics are written to the output. The BIN_MEAN TOTAL column now contains the mean of the bin totals. Since all obs are used for each climo bin, those bin totals are a constant value.

Differentiating between these 3 uses is confusing.
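
As a rough illustration of the "ensemble relative frequency" mentioned in the second use, the sketch below computes the fraction of ensemble members that exceed a threshold. This is only an assumed reading of PROB(NAME>THRESH); the function name ens_rel_freq is hypothetical, and the real tool presumably also handles missing members and other comparison operators, which this sketch ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not MET code): fraction of ensemble members whose
// value exceeds the threshold, i.e. an estimate of PROB(NAME>THRESH).
double ens_rel_freq(const std::vector<double> &members, double thresh) {
   assert(!members.empty());
   std::size_t n_exceed = 0;
   for (double v : members) {
      if (v > thresh) ++n_exceed;
   }
   return static_cast<double>(n_exceed) / static_cast<double>(members.size());
}
```

For example, with members {1, 2, 3, 4} and a threshold of 2.5, two of the four members exceed the threshold, giving a probability of 0.5 to pair with the observation.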

Recommend testing for the reviewer(s) to perform, including the location of input datasets, and any additional instructions:

Review the code changes, documentation updates (please check carefully for typos and readability), and unit test output differences.
Do these changes include sufficient documentation updates, ensuring that no errors or warnings exist in the build of the documentation? [Yes]
Do these changes include sufficient testing updates? [Yes]
Modified the configuration for 2 existing tests.
Will this PR result in changes to the test suite? [Yes]

If yes, describe the new output and/or changes to the existing output:

Generates 5 new output files for 2 existing tests. Also modified the contents of the .stat files for those tests. The new files are for the PCT, PSTD, PJC, PRC, and ECLV line types.
Please complete this pull request review by [Wed 3/2].

Pull Request Checklist

See the METplus Workflow for details.

Review the source issue metadata (required labels, projects, and milestone).
Complete the PR definition above.
Ensure the PR title matches the feature or bugfix branch name.
Define the PR metadata, as permissions allow.
Select: Reviewer(s)
Select: Organization level software support Project or Repository level development cycle Project
Select: Milestone as the version that will include these changes
After submitting the PR, select Linked issues with the original issue number.
After the PR is approved, merge your changes. If permissions do not allow this, request that the reviewer do the merge.
Close the linked issue and delete your feature or bugfix branch from GitHub.

…, PRJ, and ECLV outputs from ensemble-stat. Also added fcat_ta and ocat_ta ThreshArray objects to store the needed fcst and obs thresholds. Still need to actually update ensemble_stat.cc to use them and also need to make config file, unit test, and documentation updates.

…he writing of STAT output lines for both point and gridded verification into a common write_txt_files() function. In the process, I discovered that we were checking the top-level conf_info.output_flag option instead of the vx_opt specific ones. These changes switch to using the vx_opt ones so that we can control the output written task by task.

JohnHalleyGotway · 2022-02-24T23:32:37Z

met/src/tools/core/ensemble_stat/ensemble_stat.cc

+   pd_pnt.extend(pd_ens.n_obs);
+
+   // Determine the number of climo CDF bins
+   n_bin = (pd_pnt.cmn_na.n_valid() > 0 && pd_pnt.csd_na.n_valid() > 0 ?


@j-opatz this is the spot in the code whose logic I'd like to discuss. I think if the user hasn't defined prob_cat_thresh but has defined climo_mean and climo_stdev along with climo bins, we want to use those thresholds to define/evaluate probabilities. But this is the logic that always gets so confusing!
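
To make the branching under discussion easier to follow, here is a hedged sketch of how the three uses from the PR description might be told apart. It is not the logic in ensemble_stat.cc; the enum ProbMode, the function select_prob_mode, and its boolean inputs are all hypothetical names that only restate the conditions described in this PR.

```cpp
// Hypothetical modes mirroring the three uses in the PR description.
enum class ProbMode {
   ExplicitThresh,    // use 1: prob_cat_thresh only
   SubsetByClimoBin,  // use 2: prob_cat_thresh plus climo bins
   ClimoBinThresh,    // use 3: thresholds derived from the climo distribution
   None               // no probabilistic output can be derived
};

// Hypothetical helper (not MET code): choose a mode from the config state.
ProbMode select_prob_mode(bool have_prob_cat_thresh,
                          bool climo_bins_requested,
                          bool have_climo_mean,
                          bool have_climo_stdev) {
   if (have_prob_cat_thresh) {
      // Explicit thresholds win; climo bins only SUBSET the pairs.
      return climo_bins_requested ? ProbMode::SubsetByClimoBin
                                  : ProbMode::ExplicitThresh;
   }
   // No explicit thresholds: climo bins define the thresholds, but only
   // when both the climo mean and standard deviation are supplied.
   if (climo_bins_requested && have_climo_mean && have_climo_stdev) {
      return ProbMode::ClimoBinThresh;
   }
   return ProbMode::None;
}
```

Writing the cases out this way shows that the same climo bin setting plays two different roles depending on whether prob_cat_thresh is empty, which is the source of the confusion.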

…ether the total column should be summed or averaged. Previously, they were always summed since the climo bins were used to SUBSET the matched pairs. In Ensemble-Stat, the full set of pairs can now be thresholded multiple times based on the climo bins. As such, the TOTAL value for each input should remain constant. Rather than summing those totals, they should now be averaged (but this is the average of a constant value).
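
The sum-versus-average distinction can be sketched as follows. This is an assumed simplification, not the MET implementation: bin_mean_total and the pairs_were_subset flag are hypothetical names standing in for whatever the aggregation code actually uses.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper (not MET code): combine per-bin TOTAL values into
// the TOTAL reported on the BIN_MEAN summary line.
int bin_mean_total(const std::vector<int> &bin_totals, bool pairs_were_subset) {
   assert(!bin_totals.empty());
   const int sum = std::accumulate(bin_totals.begin(), bin_totals.end(), 0);
   // Subset case: each pair lands in exactly one bin, so the sum recovers
   // the full pair count.
   if (pairs_were_subset) return sum;
   // Re-thresholded case: every bin saw all pairs, so the totals are equal
   // and their mean is that same constant.
   return sum / static_cast<int>(bin_totals.size());
}
```

With 60 matched pairs split across three bins as {10, 20, 30}, summing gives 60. With all 60 pairs evaluated in each of three bins, the totals are {60, 60, 60} and averaging also gives 60, whereas summing would triple-count the pairs.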

j-opatz

The write-up is done well, given that the necessary topic of discussion is confusing. Unit tests seem to have the corresponding output, and while I think this feature correctly adds the requested ability in EnsembleStat, it will be best proven when provided back to the EMC requestor and it's used there.

Co-authored-by: Julie Prestopnik <jpresto@seneca.rap.ucar.edu>
Co-authored-by: johnhg <johnhg@ucar.edu>
Co-authored-by: Seth Linden <linden@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>
Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>
Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: jprestop <jpresto@ucar.edu>
Co-authored-by: Howard Soh <hsoh@seneca.rap.ucar.edu>
Co-authored-by: Seth Linden <linden@ucar.edu>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: George McCabe <23407799+georgemccabe@users.noreply.github.com>
Co-authored-by: John Halley Gotway <johnhg@seneca.rap.ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@seneca.rap.ucar.edu>
Co-authored-by: mo-mglover <78152252+mo-mglover@users.noreply.github.com>
Co-authored-by: davidalbo <dave@ucar.edu>
Co-authored-by: lisagoodrich <33230218+lisagoodrich@users.noreply.github.com>

JohnHalleyGotway added 7 commits February 22, 2022 17:35

Per #1259, add the eclv_points config option to Ensemble-Stat.

522c58d

Per #1259, ci-run-unit strip out commented-out old code for writing s…

500fa5b

…tat output lines.

Per #1259, no real change to point_stat. Just removing a blank line.

1ac8c61

Per #1259, making progress. Added prob_pct_thresh option to ensemble-…

eebe649

…stat to define how PCT thresholds. Still need to update documentation, tests, and config files.

ci-run-unit Merge branch 'develop' into feature_1259_es_prob_stats

730b8f9

JohnHalleyGotway added this to the MET 10.1.0 milestone Feb 24, 2022

JohnHalleyGotway marked this pull request as draft February 24, 2022 15:46

JohnHalleyGotway added 2 commits February 24, 2022 14:43

Merge branch 'develop' into feature_1259_es_prob_stats

38c0c45

Per #1259, update the Ensemble-Stat chapter of the user's guide.

17d81fc

JohnHalleyGotway commented Feb 24, 2022

JohnHalleyGotway added 9 commits February 25, 2022 17:49

Per #1259, move the derivation of cdp thresholds into the thresh_arra…

422efa5

…y.cc.

Per #1259, ci-run-unit add logic to ensemble-stat to compute probabil…

7a82732

…istic outputs in 2 ways.

Per #1259, make the MET test script Ensemble-Stat config file more co…

1cc8594

…nsistent with the default one.

Per #1259, wrangling Ensemble-Stat config files, trying to make them …

77063ec

…as consistent as possible with the default configuration.

Per #1259, update unit_met_test_scripts.xml to accurately list the ou…

210a240

…tput files created by ensemble-stat.

Per #1259, update unit_climatology_1.0deg.xml to demonstrate the comp…

157a33a

…utation of binned percentile thresholds.

Per #1259, fix the logic for how to size the probabilistic output lin…

64568dd

…e types.

Per #1259, remove debug print statement.

5f68b23

JohnHalleyGotway requested a review from j-opatz March 1, 2022 23:05

JohnHalleyGotway linked an issue Mar 1, 2022 that may be closed by this pull request

Enhance Ensemble-Stat to compute probabilistic statistics for user-defined or climatology-based thresholds. #1259

Closed

21 tasks

JohnHalleyGotway marked this pull request as ready for review March 1, 2022 23:18

JohnHalleyGotway mentioned this pull request Mar 1, 2022

Add support for probabilistic verification to the Ensemble-Stat wrapper. dtcenter/METplus#1464

Closed

20 tasks

JohnHalleyGotway added 2 commits March 2, 2022 09:11

Merge branch 'develop' into feature_1259_es_prob_stats

eed918c

Merge remote-tracking branch 'origin/develop' into feature_1259_es_pr…

5172d6c

…ob_stats

j-opatz approved these changes Mar 2, 2022

JohnHalleyGotway merged commit 514c0c3 into develop Mar 2, 2022

JohnHalleyGotway deleted the feature_1259_es_prob_stats branch March 2, 2022 18:21

JohnHalleyGotway mentioned this pull request Mar 2, 2022

Update develop-ref after #2067 #2080

Merged

JohnHalleyGotway restored the feature_1259_es_prob_stats branch March 3, 2022 00:10

JohnHalleyGotway removed a link to an issue Mar 3, 2022

Enhance Ensemble-Stat to compute probabilistic statistics for user-defined or climatology-based thresholds. #1259

Closed

21 tasks
