From d7ecff341e11af8879d0bcf7a0e1b0f7ab771452 Mon Sep 17 00:00:00 2001 From: lisagoodrich <33230218+lisagoodrich@users.noreply.github.com> Date: Mon, 30 Nov 2020 17:16:09 -0700 Subject: [PATCH] Task 1455 doc (#1585) * first stab at converting to sphinx * removing all slashes * adding new link to README.rst file * working on lists * Made formatting changes * Finished fcst section * fixing spelling, bolding and italics issues * updating web links * working on formatting * updating formatting * formatting * first attempt to clean up formatting completed. * adding README to the index file * fixing warning errors * Bringing README_TC into sphinx. Updating section headers * Adding README_TC * Made formatting updates to README.rst * corrected section under wavelet * small changes * removing met/data/config/README since it is now in met/docs/Users_Guide * Added some formatting for headers * fixing chapters & sections * Fixed warnings from building * adding in code blocks * removing slashes * changes * Made changes to formatting * removing For example code blocks * major updates * fist pass at document conversion complete. * cleaning up questions about dashes * Made some formatting modifications * Removing README_TC because it is being replaced by README_TC.rst in met/docs/Users_Guide * Removing the reference to the README_TC file * Making title capitalization consistent with README * Added a space in timestring * changing to 'time string' with a space between the words. * adding a link to the new README_TC location in met/docs/Users_Guide * Modified references to README and README_TC * small formatting changes * small formatting changes * fixing tabs * fixing spacing around number 11 * removing parenthesis around reference dates. * adding parenthesis back in. * fixing references * updating references * Update appendixC.rst Removed space from "HAUSDOR FF" * Update plotting.rst Changed a couple of references of Plot_Point_Obs to Plot-Point-Obs * Update point-stat.rst Added oxford commas Co-authored-by: Julie.Prestopnik --- met/docs/Users_Guide/appendixA.rst | 2 +- met/docs/Users_Guide/appendixC.rst | 27 ++++++----- met/docs/Users_Guide/appendixF.rst | 6 +-- met/docs/Users_Guide/ensemble-stat.rst | 6 +-- met/docs/Users_Guide/grid-diag.rst | 2 +- met/docs/Users_Guide/grid-stat.rst | 6 +-- met/docs/Users_Guide/gsi-tools.rst | 24 +++++----- met/docs/Users_Guide/mode-analysis.rst | 4 +- met/docs/Users_Guide/overview.rst | 2 +- met/docs/Users_Guide/plotting.rst | 18 +++---- met/docs/Users_Guide/reformat_grid.rst | 2 +- met/docs/Users_Guide/reformat_point.rst | 13 +---- met/docs/Users_Guide/release-notes.rst | 60 +++++++++++++++--------- met/docs/Users_Guide/series-analysis.rst | 2 +- met/docs/Users_Guide/stat-analysis.rst | 2 +- met/docs/Users_Guide/tc-pairs.rst | 2 +- met/docs/Users_Guide/wavelet-stat.rst | 10 ++-- 17 files changed, 97 insertions(+), 91 deletions(-) diff --git a/met/docs/Users_Guide/appendixA.rst b/met/docs/Users_Guide/appendixA.rst index d94bf33d11..55872cf77d 100644 --- a/met/docs/Users_Guide/appendixA.rst +++ b/met/docs/Users_Guide/appendixA.rst @@ -44,7 +44,7 @@ A. Currently, very few graphics are included. The plotting tools (plot_point_obs **Q. How do I find the version of the tool I am using?** -A. Type the name of the tool followed by -version. For example, type “pb2nc -version”. +A. Type the name of the tool followed by **-version**. For example, type “pb2nc **-version**”. **Q. 
What are MET's conventions for latitude, longitude, azimuth and bearing angles?** diff --git a/met/docs/Users_Guide/appendixC.rst b/met/docs/Users_Guide/appendixC.rst index b7fd8cf412..c998dd0fc2 100644 --- a/met/docs/Users_Guide/appendixC.rst +++ b/met/docs/Users_Guide/appendixC.rst @@ -251,14 +251,14 @@ OR measures the ratio of the odds of a forecast of the event being correct to th .. math:: \text{OR } = \frac{n_{11} \times n_{00}}{n_{10} \times n_{01}} = \frac{(\frac{\text{POD}}{1 - \text{POD}})}{(\frac{\text{POFD}}{1 - \text{POFD}})}. -OR can range from 0 to :math:`\infty`. A perfect forecast would have a value of OR = infinity. OR is often expressed as the log Odds Ratio or as the Odds Ratio Skill Score (:ref:`Stephenson 2000 `). +OR can range from 0 to :math:`\infty`. A perfect forecast would have a value of OR = infinity. OR is often expressed as the log Odds Ratio or as the Odds Ratio Skill Score (:ref:`Stephenson, 2000 `). Logarithm of the Odds Ratio (LODDS) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Called "LODDS" in CTS output :numref:`table_PS_format_info_CTS` -LODDS transforms the odds ratio via the logarithm, which tends to normalize the statistic for rare events (:ref:`Stephenson 2000 `). However, it can take values of :math:`\pm\infty` when any of the contingency table counts is 0. LODDS is defined as :math:`\text{LODDS} = ln(OR)`. +LODDS transforms the odds ratio via the logarithm, which tends to normalize the statistic for rare events (:ref:`Stephenson, 2000 `). However, it can take values of :math:`\pm\infty` when any of the contingency table counts is 0. LODDS is defined as :math:`\text{LODDS} = ln(OR)`. Odds Ratio Skill Score (ORSS) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -269,7 +269,7 @@ ORSS is a skill score based on the odds ratio. ORSS is defined as .. math:: \text{ORSS } = \frac{OR - 1}{OR + 1}. -ORSS is sometimes also referred to as Yule's Q. (:ref:`Stephenson 2000 `). +ORSS is sometimes also referred to as Yule's Q. (:ref:`Stephenson, 2000 `). Extreme Dependency Score (EDS) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -280,7 +280,7 @@ The extreme dependency score measures the association between forecast and obser .. math:: \text{EDS } = \frac{2 ln(\frac{n_{11} + n_{01}}{T})}{ln(\frac{n_{11}}{T})} - 1. -EDS can range from -1 to 1, with 0 representing no skill. A perfect forecast would have a value of EDS = 1. EDS is independent of bias, so should be presented along with the frequency bias statistic (:ref:`Stephenson et al, 2008 `). +EDS can range from -1 to 1, with 0 representing no skill. A perfect forecast would have a value of EDS = 1. EDS is independent of bias, so should be presented along with the frequency bias statistic (:ref:`Stephenson et al., 2008 `). Extreme Dependency Index (EDI) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -324,7 +324,7 @@ Bias Adjusted Gilbert Skill Score (GSS) Called "BAGSS" in CTS output :numref:`table_PS_format_info_CTS` -BAGSS is based on the GSS, but is corrected as much as possible for forecast bias (:ref:`Brill and Mesinger, 2009`). +BAGSS is based on the GSS, but is corrected as much as possible for forecast bias (:ref:`Brill and Mesinger, 2009 `). Economic Cost Loss Relative Value (ECLV) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -344,7 +344,7 @@ For cost / loss ratio above the base rate, the ECLV is defined as: MET verification measures for continuous variables __________________________________________________ -For continuous variables, many verification measures are based on the forecast error (i.e., **f - o**). 
However, it also is of interest to investigate characteristics of the forecasts, and the observations, as well as their relationship. These concepts are consistent with the general framework for verification outlined by :ref:`Murphy and Winkler (1987) `. The statistics produced by MET for continuous forecasts represent this philosophy of verification, which focuses on a variety of aspects of performance rather than a single measure. +For continuous variables, many verification measures are based on the forecast error (i.e., **f - o**). However, it also is of interest to investigate characteristics of the forecasts, and the observations, as well as their relationship. These concepts are consistent with the general framework for verification outlined by :ref:`Murphy and Winkler, (1987) `. The statistics produced by MET for continuous forecasts represent this philosophy of verification, which focuses on a variety of aspects of performance rather than a single measure. The verification measures currently evaluated by the Point-Stat tool are defined and described in the subsections below. In these definitions, **f** represents the forecasts, **o** represents the observation, and **n** is the number of forecast-observation pairs. @@ -567,7 +567,7 @@ Partial Sums lines (SL1L2, SAL1L2, VL1L2, VAL1L2) The SL1L2, SAL1L2, VL1L2, and VAL1L2 line types are used to store data summaries (e.g. partial sums) that can later be accumulated into verification statistics. These are divided according to scalar or vector summaries (S or V). The climate anomaly values (A) can be stored in place of the actuals, which is just a re-centering of the values around the climatological average. L1 and L2 refer to the L1 and L2 norms, the distance metrics commonly referred to as the “city block” and “Euclidean” distances. The city block is the absolute value of a distance while the Euclidean distance is the square root of the squared distance. -The partial sums can be accumulated over individual cases to produce statistics for a longer period without any loss of information because these sums are *sufficient* for resulting statistics such as RMSE, bias, correlation coefficient, and MAE (:ref:`Mood et al, 1974 `). Thus, the individual errors need not be stored, all of the information relevant to calculation of statistics are contained in the sums. As an example, the sum of all data points and the sum of all squared data points (or equivalently, the sample mean and sample variance) are *jointly sufficient* for estimates of the Gaussian distribution mean and variance. +The partial sums can be accumulated over individual cases to produce statistics for a longer period without any loss of information because these sums are *sufficient* for resulting statistics such as RMSE, bias, correlation coefficient, and MAE (:ref:`Mood et al., 1974 `). Thus, the individual errors need not be stored, all of the information relevant to calculation of statistics are contained in the sums. As an example, the sum of all data points and the sum of all squared data points (or equivalently, the sample mean and sample variance) are *jointly sufficient* for estimates of the Gaussian distribution mean and variance. *Minimally sufficient* statistics are those that condense the data most, with no loss of information. Statistics based on L1 and L2 norms allow for good compression of information. Statistics based on other norms, such as order statistics, do not result in good compression of information. 
For this reason, statistics such as RMSE are often preferred to statistics such as the median absolute deviation. The partial sums are not sufficient for order statistics, such as the median or quartiles. @@ -655,7 +655,7 @@ Gradient values Called "TOTAL", "FGBAR", "OGBAR", "MGBAR", "EGBAR", "S1", "S1_OG", and "FGOG_RATIO" in GRAD output :numref:`table_GS_format_info_GRAD` -These statistics are only computed by the Grid_Stat tool and require vectors. Here :math:`\nabla` is the gradient operator, which in this applications signifies the difference between adjacent grid points in both the grid-x and grid-y directions. TOTAL is the count of grid locations used in the calculations. The remaining measures are defined below: +These statistics are only computed by the Grid-Stat tool and require vectors. Here :math:`\nabla` is the gradient operator, which in this applications signifies the difference between adjacent grid points in both the grid-x and grid-y directions. TOTAL is the count of grid locations used in the calculations. The remaining measures are defined below: .. math:: \text{FGBAR} = \text{Mean}|\nabla f| = \frac{1}{n} \sum_{i=1}^n | \nabla f_i| @@ -797,7 +797,7 @@ Calibration Called "CALIBRATION" in PJC output :numref:`table_PS_format_info_PJC` -Calibration is the conditional probability of an event given each probability forecast category (i.e. each row in the **nx2** contingency table). This set of measures is paired with refinement in the calibration-refinement factorization discussed in :ref:`Wilks (2011) `. A well-calibrated forecast will have calibration values that are near the forecast probability. For example, a 50% probability of precipitation should ideally have a calibration value of 0.5. If the calibration value is higher, then the probability has been underestimated, and vice versa. +Calibration is the conditional probability of an event given each probability forecast category (i.e. each row in the **nx2** contingency table). This set of measures is paired with refinement in the calibration-refinement factorization discussed in :ref:`Wilks, (2011) `. A well-calibrated forecast will have calibration values that are near the forecast probability. For example, a 50% probability of precipitation should ideally have a calibration value of 0.5. If the calibration value is higher, then the probability has been underestimated, and vice versa. .. math:: \text{Calibration}(i) = \frac{n_{i1}}{n_{1.}} = \text{probability}(o_1|p_i) @@ -879,7 +879,7 @@ CRPS Called "CRPS" in ECNT output :numref:`table_ES_header_info_es_out_ECNT` -The continuous ranked probability score (CRPS) is the integral, over all possible thresholds, of the Brier scores (:ref:`Gneiting et al, 2004 `). In MET, the CRPS calculation uses a normal distribution fit to the ensemble forecasts. In many cases, use of other distributions would be better. +The continuous ranked probability score (CRPS) is the integral, over all possible thresholds, of the Brier scores (:ref:`Gneiting et al., 2004 `). In MET, the CRPS calculation uses a normal distribution fit to the ensemble forecasts. In many cases, use of other distributions would be better. WARNING: The normal distribution is probably a good fit for temperature and pressure, and possibly a not horrible fit for winds. However, the normal approximation will not work on most precipitation forecasts and may fail for many other atmospheric variables. 
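For reference, the normal-approximation approach described above admits a closed form. For a normal predictive distribution evaluated at an observation y, the CRPS is given by the standard result from the statistics literature (shown here for illustration only; this is not a transcription of the MET source code):

.. math:: \text{CRPS}(N(\mu, \sigma), y) = \sigma \Big[ z(2 \Phi(z) - 1) + 2 \phi(z) - \frac{1}{\sqrt{\pi}} \Big], \quad z = \frac{y - \mu}{\sigma}

where :math:`\Phi` and :math:`\phi` are the cumulative distribution function and density of the standard normal distribution.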
@@ -907,7 +907,7 @@ IGN Called "IGN" in ECNT output :numref:`table_ES_header_info_es_out_ECNT` -The ignorance score (IGN) is the negative logarithm of a predictive probability density function (:ref:`Gneiting et al, 2004 `). In MET, the IGN is calculated based on a normal approximation to the forecast distribution (i.e. a normal pdf is fit to the forecast values). This approximation may not be valid, especially for discontinuous forecasts like precipitation, and also for very skewed forecasts. For a single normal distribution **N** with parameters :math:`\mu \text{ and } \sigma`, the ignorance score is +The ignorance score (IGN) is the negative logarithm of a predictive probability density function (:ref:`Gneiting et al., 2004 `). In MET, the IGN is calculated based on a normal approximation to the forecast distribution (i.e. a normal pdf is fit to the forecast values). This approximation may not be valid, especially for discontinuous forecasts like precipitation, and also for very skewed forecasts. For a single normal distribution **N** with parameters :math:`\mu \text{ and } \sigma`, the ignorance score is .. math:: \text{ign} (N( \mu, \sigma),y) = \frac{1}{2} \ln (2 \pi \sigma^2 ) + \frac{(y - \mu)^2}{\sigma^2}. @@ -975,7 +975,7 @@ The traditional contingency table statistics computed by the Grid-Stat neighborh All of these measures are defined in :numref:`categorical variables`. -In addition to these standard statistics, the neighborhood analysis provides additional continuous measures, the Fractions Brier Score and the Fractions Skill Score. For reference, the Asymptotic Fractions Skill Score and Uniform Fractions Skill Score are also calculated. These measures are defined here, but are explained in much greater detail in :ref:`Ebert (2008) ` and :ref:`Roberts and Lean 2008 `. Roberts and Lean (2008) also present an application of the methodology. +In addition to these standard statistics, the neighborhood analysis provides additional continuous measures, the Fractions Brier Score and the Fractions Skill Score. For reference, the Asymptotic Fractions Skill Score and Uniform Fractions Skill Score are also calculated. These measures are defined here, but are explained in much greater detail in :ref:`Ebert (2008) ` and :ref:`Roberts and Lean (2008) `. :ref:`Roberts and Lean (2008) ` also present an application of the methodology. Fractions Brier Score ~~~~~~~~~~~~~~~~~~~~~ @@ -1047,7 +1047,8 @@ The results of the distance map verification approaches that are included in the Baddeley's :math:`\Delta` Metric and Hausdorff Distance ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Called “BADDELEY” and “HAUSDORFF” in the DMAP output :numref:`table_GS_format_info_DMAP` +Called “BADDELEY” and “HAUSDORFF” in the DMAP +output :numref:`table_GS_format_info_DMAP` The Baddeley's :math:`\Delta` Metric is given by diff --git a/met/docs/Users_Guide/appendixF.rst b/met/docs/Users_Guide/appendixF.rst index d47a04ba86..2e67fc0125 100644 --- a/met/docs/Users_Guide/appendixF.rst +++ b/met/docs/Users_Guide/appendixF.rst @@ -163,7 +163,7 @@ On the command line for any of the MET tools which will be obtaining its data fr ___________________ -Listed below is an example of running the **plot_data_plane** tool to call a Python script for data that is included with the MET release tarball. Assuming the MET executables are in your path, this example may be run from the top-level MET source code directory. 
+Listed below is an example of running the Plot-Data-Plane tool to call a Python script for data that is included with the MET release tarball. Assuming the MET executables are in your path, this example may be run from the top-level MET source code directory. .. code-block:: none @@ -171,7 +171,7 @@ Listed below is an example of running the **plot_data_plane** tool to call a Pyt 'name="scripts/python/read_ascii_numpy.py data/python/fcst.txt FCST";' \ -title "Python enabled plot_data_plane" -The first argument for the **plot_data_plane** tool is the gridded data file to be read. When calling a NumPy Python script, set this to the constant string PYTHON_NUMPY. The second argument is the name of the output PostScript file to be written. The third argument is a string describing the data to be plotted. When calling a Python script, set **name** to the Python script to be run along with command line arguments. Lastly, the **-title** option is used to add a title to the plot. Note that any print statements included in the Python script will be printed to the screen. The above example results in the following log messages. +The first argument for the Plot-Data-Plane tool is the gridded data file to be read. When calling a NumPy Python script, set this to the constant string PYTHON_NUMPY. The second argument is the name of the output PostScript file to be written. The third argument is a string describing the data to be plotted. When calling a Python script, set **name** to the Python script to be run along with command line arguments. Lastly, the **-title** option is used to add a title to the plot. Note that any print statements included in the Python script will be printed to the screen. The above example results in the following log messages. .. code-block:: none @@ -191,7 +191,7 @@ The first argument for the **plot_data_plane** tool is the gridded data file to The second option was added to support the use of Python embedding in tools which read multiple input files. Option 1 reads a single field of data from a single source, whereas tools like Ensemble-Stat, Series-Analysis, and MTD read data from multiple input files. While option 2 can be used in any of the MET tools, it is required for Python embedding in Ensemble-Stat, Series-Analysis, and MTD. -On the command line for any of the MET tools, specify the path to the input gridded data file(s) as the usage statement for the tool indicates. Do **not** substitute in PYTHON_NUMPY or PYTHON_XARRAY on the command line. In the config file dictionary set the **file_type** entry to either PYTHON_NUMPY or PYTHON_XARRAY to activate the Python embedding logic. Then, in the **name** entry of the config file dictionaries for the forecast or observation data, list the Python script to be run followed by any command line arguments for that script. However, in the Python command, replace the name of the input gridded data file with the constant string MET_PYTHON_INPUT_ARG. When looping over multiple input files, the MET tools will replace that constant **MET_PYTHON_INPUT_ARG** with the path to the file currently being processed. The example **plot_data_plane** command listed below yields the same result as the example shown above, but using the option 2 logic instead. +On the command line for any of the MET tools, specify the path to the input gridded data file(s) as the usage statement for the tool indicates. Do **not** substitute in PYTHON_NUMPY or PYTHON_XARRAY on the command line. 
In the config file dictionary set the **file_type** entry to either PYTHON_NUMPY or PYTHON_XARRAY to activate the Python embedding logic. Then, in the **name** entry of the config file dictionaries for the forecast or observation data, list the Python script to be run followed by any command line arguments for that script. However, in the Python command, replace the name of the input gridded data file with the constant string MET_PYTHON_INPUT_ARG. When looping over multiple input files, the MET tools will replace that constant **MET_PYTHON_INPUT_ARG** with the path to the file currently being processed. The example plot_data_plane command listed below yields the same result as the example shown above, but using the option 2 logic instead. The Ensemble-Stat, Series-Analysis, and MTD tools support the use of file lists on the command line, as do some other MET tools. Typically, the ASCII file list contains a list of files which actually exist on your machine and should be read as input. For Python embedding, these tools loop over the ASCII file list entries, set MET_PYTHON_INPUT_ARG to that string, and execute the Python script. This only allows a single command line argument to be passed to the Python script. However multiple arguments may be concatenated together using some delimiter, and the Python script can be defined to parse arguments using that delimiter. When file lists are constructed in this way, the entries will likely not be files which actually exist on your machine. In this case, users should place the constant string "file_list" on the first line of their ASCII file lists. This will ensure that the MET tools will parse the file list properly. diff --git a/met/docs/Users_Guide/ensemble-stat.rst b/met/docs/Users_Guide/ensemble-stat.rst index cc81dc357b..ea2a1329a6 100644 --- a/met/docs/Users_Guide/ensemble-stat.rst +++ b/met/docs/Users_Guide/ensemble-stat.rst @@ -27,16 +27,16 @@ Ensemble statistics Rank histograms and probability integral transform (PIT) histograms are used to determine if the distribution of ensemble values is the same as the distribution of observed values for any forecast field (:ref:`Hamill, 2001 `). The rank histogram is a tally of the rank of the observed value when placed in order with each of the ensemble values from the same location. If the distributions are identical, then the rank of the observation will be uniformly distributed. In other words, it will fall among the ensemble members randomly in equal likelihood. The PIT histogram applies this same concept, but transforms the actual rank into a probability to facilitate ensembles of differing sizes or with missing members. -Often, the goal of ensemble forecasting is to reproduce the distribution of observations using a set of many forecasts. In other words, the ensemble members represent the set of all possible outcomes. When this is true, the spread of the ensemble is identical to the error in the mean forecast. Though this rarely occurs in practice, the spread / skill relationship is still typically assessed for ensemble forecasts (:ref:`Barker, 1991 `; :ref:`Buizza, 1997 `). MET calculates the spread and skill in user defined categories according to :ref:`Eckel et al, 2012 `. +Often, the goal of ensemble forecasting is to reproduce the distribution of observations using a set of many forecasts. In other words, the ensemble members represent the set of all possible outcomes. When this is true, the spread of the ensemble is identical to the error in the mean forecast. 
Though this rarely occurs in practice, the spread / skill relationship is still typically assessed for ensemble forecasts (:ref:`Barker, 1991 `; :ref:`Buizza,1997 `). MET calculates the spread and skill in user defined categories according to :ref:`Eckel et al. (2012) `. The relative position (RELP) is a count of the number of times each ensemble member is closest to the observation. For stochastic or randomly derived ensembles, this statistic is meaningless. For specified ensemble members, however, it can assist users in determining if any ensemble member is performing consistently better or worse than the others. -The ranked probability score (RPS) is included in the Ranked Probability Score (RPS) line type. It is the mean of the Brier scores computed from ensemble probabilities derived for each probability category threshold (prob_cat_thresh) specified in the configuration file. The continuous ranked probability score (CRPS) is the average the distance between the forecast (ensemble) cumulative distribution function and the observation cumulative distribution function. It is an analog of the Brier score, but for continuous forecast and observation fields. (:ref:`Gneiting et al, 2004 `). The CRPS statistic is included in the Ensemble Continuous Statistics (ECNT) line type, along with other statistics quantifying the ensemble spread and ensemble mean skill. +The ranked probability score (RPS) is included in the Ranked Probability Score (RPS) line type. It is the mean of the Brier scores computed from ensemble probabilities derived for each probability category threshold (prob_cat_thresh) specified in the configuration file. The continuous ranked probability score (CRPS) is the average the distance between the forecast (ensemble) cumulative distribution function and the observation cumulative distribution function. It is an analog of the Brier score, but for continuous forecast and observation fields. (:ref:`Gneiting et al., 2004 `). The CRPS statistic is included in the Ensemble Continuous Statistics (ECNT) line type, along with other statistics quantifying the ensemble spread and ensemble mean skill. Ensemble observation error ~~~~~~~~~~~~~~~~~~~~~~~~~~ -In an attempt to ameliorate the effect of observation errors on the verification of forecasts, a random perturbation approach has been implemented. A great deal of user flexibility has been built in, but the methods detailed in :ref:`Candile and Talagrand (2008) `. can be replicated using the appropriate options. The user selects a distribution for the observation error, along with parameters for that distribution. Rescaling and bias correction can also be specified prior to the perturbation. Random draws from the distribution can then be added to either, or both, of the forecast and observed fields, including ensemble members. Details about the effects of the choices on verification statistics should be considered, with many details provided in the literature (*e.g.* :ref:`Candille and Talagrand, 2008 `; :ref:`Saetra et al., 2004 `; :ref:`Santos and Ghelli, 2012 `). Generally, perturbation makes verification statistics better when applied to ensemble members, and worse when applied to the observations themselves. +In an attempt to ameliorate the effect of observation errors on the verification of forecasts, a random perturbation approach has been implemented. A great deal of user flexibility has been built in, but the methods detailed in :ref:`Candille and Talagrand (2008) `. can be replicated using the appropriate options. 
The user selects a distribution for the observation error, along with parameters for that distribution. Rescaling and bias correction can also be specified prior to the perturbation. Random draws from the distribution can then be added to either, or both, of the forecast and observed fields, including ensemble members. Details about the effects of the choices on verification statistics should be considered, with many details provided in the literature (*e.g.* :ref:`Candille and Talagrand, 2008 `; :ref:`Saetra et al., 2004 `; :ref:`Santos and Ghelli, 2012 `). Generally, perturbation makes verification statistics better when applied to ensemble members, and worse when applied to the observations themselves. Normal and uniform are common choices for the observation error distribution. The uniform distribution provides the benefit of being bounded on both sides, thus preventing the perturbation from taking on extreme values. Normal is the most common choice for observation error. However, the user should realize that with the very large samples typical in NWP, some large outliers will almost certainly be introduced with the perturbation. For variables that are bounded below by 0, and that may have inconsistent observation errors (e.g. larger errors with larger measurements), a lognormal distribution may be selected. Wind speeds and precipitation measurements are the most common of this type of NWP variable. The lognormal error perturbation prevents measurements of 0 from being perturbed, and applies larger perturbations when measurements are larger. This is often the desired behavior in these cases, but this distribution can also lead to some outliers being introduced in the perturbation step. diff --git a/met/docs/Users_Guide/grid-diag.rst b/met/docs/Users_Guide/grid-diag.rst index da2694d319..452a9126a2 100644 --- a/met/docs/Users_Guide/grid-diag.rst +++ b/met/docs/Users_Guide/grid-diag.rst @@ -53,7 +53,7 @@ Optional arguments for grid_diag grid_diag configuration file ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The default configuration file for the Grid-Diag tool named 'GridDiagConfig_default' can be found in the installed **share/met/config/ directory**. It is encouraged for users to copy these default files before modifying their contents. The contents of the configuration file are described in the subsections below. +The default configuration file for the Grid-Diag tool named **GridDiagConfig_default** can be found in the installed *share/met/config/* directory. It is encouraged for users to copy these default files before modifying their contents. The contents of the configuration file are described in the subsections below. _____________________ diff --git a/met/docs/Users_Guide/grid-stat.rst b/met/docs/Users_Guide/grid-stat.rst index d43555030b..0e61811576 100644 --- a/met/docs/Users_Guide/grid-stat.rst +++ b/met/docs/Users_Guide/grid-stat.rst @@ -55,7 +55,7 @@ The Grid-Stat tool allows evaluation of model forecasts using model analysis fie Statistical confidence intervals ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The confidence intervals for the Grid-Stat tool are the same as those provided for the Point-Stat tool except that the scores are based on pairing grid points with grid points so that there are likely more values for each field making any assumptions based on the central limit theorem more likely to be valid. However, it should be noted that spatial (and temporal) correlations are not presently taken into account in the confidence interval calculations. 
Therefore, confidence intervals reported may be somewhat too narrow (e.g., :ref:`Efron 2007 `). See :numref:`Appendix D, Section %s ` for details regarding confidence intervals provided by MET. +The confidence intervals for the Grid-Stat tool are the same as those provided for the Point-Stat tool except that the scores are based on pairing grid points with grid points so that there are likely more values for each field making any assumptions based on the central limit theorem more likely to be valid. However, it should be noted that spatial (and temporal) correlations are not presently taken into account in the confidence interval calculations. Therefore, confidence intervals reported may be somewhat too narrow (e.g., :ref:`Efron, 2007 `). See :numref:`Appendix D, Section %s ` for details regarding confidence intervals provided by MET. Grid weighting ~~~~~~~~~~~~~~ @@ -78,7 +78,7 @@ The MET software will compute the full one-dimensional Fourier transform, then d Decomposition via Fourier transform allows the user to evaluate the model separately at each spatial frequency. As an example, the Fourier analysis allows users to examine the "dieoff", or reduction, in anomaly correlation of geopotential height at various levels for bands of waves. A band of low wave numbers, say 0 - 3, represent larger frequency components, while a band of higher wave numbers, for example 70 - 72, represent smaller frequency components. Generally, anomaly correlation should be higher for frequencies with low wave numbers than for frequencies with high wave numbers, hence the "dieoff". -Wavelets, and in particular the MET wavelet tool, can also be used to define a band pass filter (:ref:`Casati et al, 2004 `; :ref:`Weniger et al 2016 `). Both the Fourier and wavelet methods can be used to look at different spatial scales. +Wavelets, and in particular the MET wavelet tool, can also be used to define a band pass filter (:ref:`Casati et al., 2004 `; :ref:`Weniger et al., 2016 `). Both the Fourier and wavelet methods can be used to look at different spatial scales. Gradient Statistics ~~~~~~~~~~~~~~~~~~~ @@ -196,7 +196,7 @@ In the second example, the Grid-Stat tool will verify the model data in the samp grid_stat configuration file ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The default configuration file for the Grid-Stat tool, named GridStatConfig_default, can be found in the installed *share/met/config* directory. Other versions of the configuration file are included in *scripts/config*. We recommend that users make a copy of the default (or other) configuration file prior to modifying it. The contents are described in more detail below. +The default configuration file for the Grid-Stat tool, named **GridStatConfig_default**, can be found in the installed *share/met/config* directory. Other versions of the configuration file are included in *scripts/config*. We recommend that users make a copy of the default (or other) configuration file prior to modifying it. The contents are described in more detail below. Note that environment variables may be used when editing configuration files, as described in :numref:`pb2nc configuration file` for the PB2NC tool. 
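As a brief illustration of that note (the variable name MODEL, its value, and the file names below are placeholders chosen for this sketch, not values taken from this patch), a copied Grid-Stat configuration file might reference an environment variable:

.. code-block:: none

  model = "${MODEL}";

The variable would then be defined in the shell before invoking the tool, following the usual grid_stat calling sequence of forecast file, observation file, and configuration file:

.. code-block:: none

  export MODEL=GFS
  grid_stat sample_fcst.grb sample_obs.grb GridStatConfig -outdir out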
diff --git a/met/docs/Users_Guide/gsi-tools.rst b/met/docs/Users_Guide/gsi-tools.rst index 1bf6af51d9..337596fdf0 100644 --- a/met/docs/Users_Guide/gsi-tools.rst +++ b/met/docs/Users_Guide/gsi-tools.rst @@ -9,17 +9,17 @@ For more detail on generating GSI diagnostic files and their contents, see the ` When MET reads GSI diagnostic files, the innovation (O-B; generated prior to the first outer loop) or analysis increment (O-A; generated after the final outer loop) is split into separate values for the observation (OBS) and the forecast (FCST), where the forecast value corresponds to the background (O-B) or analysis (O-A). -MET includes two tools for processing GSI diagnostic files. The gsid2mpr tool reformats individual GSI diagnostic files into the MET matched pair (MPR) format, similar to the output of the Point-Stat tool. The gsidens2orank tool processes an ensemble of GSI diagnostic files and reformats them into the MET observation rank (ORANK) line type, similar to the output of the Ensemble-Stat tool. The output of both tools may be passed to the Stat-Analysis tool to compute a wide variety of continuous, categorical, and ensemble statistics. +MET includes two tools for processing GSI diagnostic files. The GSID2MPR tool reformats individual GSI diagnostic files into the MET matched pair (MPR) format, similar to the output of the Point-Stat tool. The GSIDENS2ORANK tool processes an ensemble of GSI diagnostic files and reformats them into the MET observation rank (ORANK) line type, similar to the output of the Ensemble-Stat tool. The output of both tools may be passed to the Stat-Analysis tool to compute a wide variety of continuous, categorical, and ensemble statistics. GSID2MPR tool _____________ -This section describes how to run the tool gsid2mpr tool. The gsid2mpr tool reformats one or more GSI diagnostic files into an ASCII matched pair (MPR) format, similar to the MPR output of the Point-Stat tool. The output MPR data may be passed to the Stat-Analysis tool to compute a wide variety of continuous or categorical statistics. +This section describes how to run the tool GSID2MPR tool. The GSID2MPR tool reformats one or more GSI diagnostic files into an ASCII matched pair (MPR) format, similar to the MPR output of the Point-Stat tool. The output MPR data may be passed to the Stat-Analysis tool to compute a wide variety of continuous or categorical statistics. gsid2mpr usage ~~~~~~~~~~~~~~ -The usage statement for the gsid2mpr tool is shown below: +The usage statement for the GSID2MPR tool is shown below: .. code-block:: none @@ -34,7 +34,7 @@ The usage statement for the gsid2mpr tool is shown below: [-log file] [-v level] -gsid2mpr has one required argument and and accepts several optional ones. +gsid2mpr has one required argument and accepts several optional ones. Required arguments for gsid2mpr ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -68,14 +68,14 @@ An example of the gsid2mpr calling sequence is shown below: -set_hdr MODEL GSI_MEM001 \ -outdir out -In this example, the gsid2mpr tool will process a single input file named **diag_conv_ges.mem001** file, set the output **MODEL** header column to **GSI_MEM001**, and write output to the **out** directory. The output file is named the same as the input file but a **.stat** suffix is added to indicate its format. +In this example, the GSID2MPR tool will process a single input file named **diag_conv_ges.mem001** file, set the output **MODEL** header column to **GSI_MEM001**, and write output to the **out** directory. 
The output file is named the same as the input file but a **.stat** suffix is added to indicate its format. gsid2mpr output ~~~~~~~~~~~~~~~ -The gsid2mpr tool performs a simple reformatting step and thus requires no configuration file. It can read both conventional and radiance binary GSI diagnostic files. Support for additional GSI diagnostic file type may be added in future releases. Conventional files are determined by the presence of the string **conv** in the filename. Files that are not conventional are assumed to contain radiance data. Multiple files of either type may be passed in a single call to the gsid2mpr tool. For each input file, an output file will be generated containing the corresponding matched pair data. +The GSID2MPR tool performs a simple reformatting step and thus requires no configuration file. It can read both conventional and radiance binary GSI diagnostic files. Support for additional GSI diagnostic file type may be added in future releases. Conventional files are determined by the presence of the string **conv** in the filename. Files that are not conventional are assumed to contain radiance data. Multiple files of either type may be passed in a single call to the GSID2MPR tool. For each input file, an output file will be generated containing the corresponding matched pair data. -The gsid2mpr tool writes the same set of MPR output columns for the conventional and radiance data types. However, it also writes additional columns at the end of the MPR line which depend on the input file type. Those additional columns are described in the following tables. +The GSID2MPR tool writes the same set of MPR output columns for the conventional and radiance data types. However, it also writes additional columns at the end of the MPR line which depend on the input file type. Those additional columns are described in the following tables. .. list-table:: Format information for GSI Diagnostic Conventional MPR (Matched Pair) output line type. @@ -245,12 +245,12 @@ In this example, the Stat-Analysis tool will read MPR lines from the input file GSIDENS2ORANK tool __________________ -This section describes how to run the gsidens2orank tool. The gsidens2orank tool processes an ensemble of GSI diagnostic files and reformats them into the MET observation rank (ORANK) line type, similar to the output of the Ensemble-Stat tool. The ORANK line type contains ensemble matched pair information and is analogous to the MPR line type for a deterministic model. The output ORANK data may be passed to the Stat-Analysis tool to compute ensemble statistics. +This section describes how to run the GSIDENS2ORANK tool. The GSIDENS2ORANK tool processes an ensemble of GSI diagnostic files and reformats them into the MET observation rank (ORANK) line type, similar to the output of the Ensemble-Stat tool. The ORANK line type contains ensemble matched pair information and is analogous to the MPR line type for a deterministic model. The output ORANK data may be passed to the Stat-Analysis tool to compute ensemble statistics. gsidens2orank usage ~~~~~~~~~~~~~~~~~~~ -The usage statement for the gsidens2orank tool is shown below: +The usage statement for the GSIDENS2ORANK tool is shown below: .. 
code-block:: none @@ -303,14 +303,14 @@ An example of the gsidens2orank calling sequence is shown below: -ens_mean diag_conv_ges.ensmean \ -out diag_conv_ges_ens_mean_orank.txt -In this example, the gsidens2orank tool will process all of the ensemble members whose file name **matches diag_conv_ges.mem\*,** write output to the file named **diag_conv_ges_ens_mean_orank.txt**, and populate the output **ENS_MEAN** column with the values found in the **diag_conv_ges.ensmean** file rather than computing the ensemble mean values from the ensemble members on the fly. +In this example, the GSIDENS2ORANK tool will process all of the ensemble members whose file name **matches diag_conv_ges.mem\*,** write output to the file named **diag_conv_ges_ens_mean_orank.txt**, and populate the output **ENS_MEAN** column with the values found in the **diag_conv_ges.ensmean** file rather than computing the ensemble mean values from the ensemble members on the fly. gsidens2orank output ~~~~~~~~~~~~~~~~~~~~ -The gsidens2orank tool performs a simple reformatting step and thus requires no configuration file. The multiple files passed to it are interpreted as members of the same ensemble. Therefore, each call to the tool processes exactly one ensemble. All input ensemble GSI diagnostic files must be of the same type. Mixing conventional and radiance files together will result in a runtime error. The gsidens2orank tool processes each ensemble member and keeps track of the observations it encounters. It constructs a list of the ensemble values corresponding to each observation and writes an output ORANK line listing the observation value, its rank, and all the ensemble values. The random number generator is used by the gsidens2orank tool to randomly assign a rank value in the case of ties. +The GSIDENS2ORANK tool performs a simple reformatting step and thus requires no configuration file. The multiple files passed to it are interpreted as members of the same ensemble. Therefore, each call to the tool processes exactly one ensemble. All input ensemble GSI diagnostic files must be of the same type. Mixing conventional and radiance files together will result in a runtime error. The GSIDENS2ORANK tool processes each ensemble member and keeps track of the observations it encounters. It constructs a list of the ensemble values corresponding to each observation and writes an output ORANK line listing the observation value, its rank, and all the ensemble values. The random number generator is used by the GSIDENS2ORANK tool to randomly assign a rank value in the case of ties. -The gsid2mpr tool writes the same set of ORANK output columns for the conventional and radiance data types. However, it also writes additional columns at the end of the ORANK line which depend on the input file type. The extra columns are limited to quantities which remain constant over all the ensemble members and are therefore largely a subset of the extra columns written by the gsid2mpr tool. Those additional columns are described in the following tables. +The GSID2MPR tool writes the same set of ORANK output columns for the conventional and radiance data types. However, it also writes additional columns at the end of the ORANK line which depend on the input file type. The extra columns are limited to quantities which remain constant over all the ensemble members and are therefore largely a subset of the extra columns written by the GSID2MPR tool. Those additional columns are described in the following tables. .. 
list-table:: Format information for GSI Diagnostic Conventional ORANK (Observation Rank) output line type. :widths: auto diff --git a/met/docs/Users_Guide/mode-analysis.rst b/met/docs/Users_Guide/mode-analysis.rst index 6f695f4f14..53852534e3 100644 --- a/met/docs/Users_Guide/mode-analysis.rst +++ b/met/docs/Users_Guide/mode-analysis.rst @@ -20,7 +20,7 @@ The other option for operating the analysis tool is “bycase”. Given initial Practical information _____________________ -The MODE-Analysis tool reads lines from MODE ASCII output files and applies filtering and computes basic statistics on the object attribute values. For each job type, filter parameters can be set to determine which MODE output lines are used. The following sections describe the **mode_analysis** usage statement, required arguments, and optional arguments. +The MODE-Analysis tool reads lines from MODE ASCII output files and applies filtering and computes basic statistics on the object attribute values. For each job type, filter parameters can be set to determine which MODE output lines are used. The following sections describe the mode_analysis usage statement, required arguments, and optional arguments. .. _mode_analysis-usage: @@ -59,7 +59,7 @@ Specifying **-bycase** will produce a table of metrics for each case undergoing Optional arguments for mode_analysis ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -3. The **mode_analysis** options are described in the following section. These are divided into sub-sections describing the analysis options and mode line options. +3. The mode_analysis options are described in the following section. These are divided into sub-sections describing the analysis options and mode line options. Analysis options ^^^^^^^^^^^^^^^^ diff --git a/met/docs/Users_Guide/overview.rst b/met/docs/Users_Guide/overview.rst index e1e02390b5..9450a2e3af 100644 --- a/met/docs/Users_Guide/overview.rst +++ b/met/docs/Users_Guide/overview.rst @@ -47,7 +47,7 @@ Several optional plotting utilities are provided to assist users in checking the The main statistical analysis components of the current version of MET are: Point-Stat, Grid-Stat, Series-Analysis, Ensemble-Stat, MODE, MODE-TD (MTD), Grid-Diag, and Wavelet-Stat. The Point-Stat tool is used for grid-to-point verification, or verification of a gridded forecast field against a point-based observation (i.e., surface observing stations, ACARS, rawinsondes, and other observation types that could be described as a point observation). In addition to providing traditional forecast verification scores for both continuous and categorical variables, confidence intervals are also produced using parametric and non-parametric methods. Confidence intervals take into account the uncertainty associated with verification statistics due to sampling variability and limitations in sample size. These intervals provide more meaningful information about forecast performance. For example, confidence intervals allow credible comparisons of performance between two models when a limited number of model runs is available. -Sometimes it may be useful to verify a forecast against gridded fields (e.g., Stage IV precipitation analyses). The Grid-Stat tool produces traditional verification statistics when a gridded field is used as the observational dataset. Like the Point-Stat tool, the Grid-Stat tool also produces confidence intervals. The Grid-Stat tool also includes "neighborhood" spatial methods, such as the Fractional Skill Score (:ref:`Roberts and Lean 2008 `). 
These methods are discussed in :ref:`Ebert (2008) `. The Grid-Stat tool accumulates statistics over the entire domain. +Sometimes it may be useful to verify a forecast against gridded fields (e.g., Stage IV precipitation analyses). The Grid-Stat tool produces traditional verification statistics when a gridded field is used as the observational dataset. Like the Point-Stat tool, the Grid-Stat tool also produces confidence intervals. The Grid-Stat tool also includes "neighborhood" spatial methods, such as the Fractional Skill Score (:ref:`Roberts and Lean, 2008 `). These methods are discussed in :ref:`Ebert (2008) `. The Grid-Stat tool accumulates statistics over the entire domain. Users wishing to accumulate statistics over a time, height, or other series separately for each grid location should use the Series-Analysis tool. Series-Analysis can read any gridded matched pair data produced by the other MET tools and accumulate them, keeping each spatial location separate. Maps of these statistics can be useful for diagnosing spatial differences in forecast quality. diff --git a/met/docs/Users_Guide/plotting.rst b/met/docs/Users_Guide/plotting.rst index c102e6940f..cfd796508c 100644 --- a/met/docs/Users_Guide/plotting.rst +++ b/met/docs/Users_Guide/plotting.rst @@ -6,12 +6,12 @@ Plotting and Graphics Support Plotting Utilities __________________ -This section describes how to check your data files using plotting utilities. Point observations can be plotted using the plot_point_obs utility. A single model level can be plotted using the plot_data_plane utility. For object based evaluations, the MODE objects can be plotted using plot_mode_field. Occasionally, a post-processing or timing error can lead to errors in MET. These tools can assist the user by showing the data to be verified to ensure that times and locations match up as expected. +This section describes how to check your data files using plotting utilities. Point observations can be plotted using the Plot-Point-Obs utility. A single model level can be plotted using the plot_data_plane utility. For object based evaluations, the MODE objects can be plotted using plot_mode_field. Occasionally, a post-processing or timing error can lead to errors in MET. These tools can assist the user by showing the data to be verified to ensure that times and locations match up as expected. plot_point_obs usage ~~~~~~~~~~~~~~~~~~~~ -The usage statement for the plot_point_obs utility is shown below: +The usage statement for the Plot-Point-Obs utility is shown below: .. code-block:: none @@ -64,7 +64,7 @@ An example of the plot_point_obs calling sequence is shown below: plot_point_obs sample_pb.nc sample_data.ps -In this example, the plot_point_obs tool will process the input sample_pb.nc file and write a postscript file containing a plot to a file named sample_pb.ps. +In this example, the Plot-Point-Obs tool will process the input sample_pb.nc file and write a postscript file containing a plot to a file named sample_pb.ps. plot_point_obs configuration file ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -210,7 +210,7 @@ The usage statement for the plot_data_plane utility is shown below: [-log file] [-v level] -**plot_data_plane** has two required arguments and can take optional ones. +plot_data_plane has two required arguments and can take optional ones. Required arguments for plot_data_plane ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -234,13 +234,13 @@ Optional arguments for plot_data_plane 8. The **-v level** option indicates the desired level of verbosity. 
The value of "level" will override the default setting of 2. Setting the verbosity to 0 will make the tool run with no log messages, while increasing the verbosity will increase the amount of logging. -An example of the **plot_data_plane** calling sequence is shown below: +An example of the plot_data_plane calling sequence is shown below: .. code-block:: none plot_data_plane test.grb test.ps 'name="TMP"; level="Z2";' -A second example of the **plot_data_plane** calling sequence is shown below: +A second example of the plot_data_plane calling sequence is shown below: .. code-block:: none @@ -251,7 +251,7 @@ In the first example, the Plot-Data-Plane tool will process the input test.grb f plot_mode_field usage ~~~~~~~~~~~~~~~~~~~~~ -The usage statement for the **plot_mode_field** utility is shown below: +The usage statement for the plot_mode_field utility is shown below: .. code-block:: none @@ -263,7 +263,7 @@ The usage statement for the **plot_mode_field** utility is shown below: [-log file] [-v level] -**plot_mode_field** has four required arguments and can take optional ones. +plot_mode_field has four required arguments and can take optional ones. Required arguments for plot_mode_field ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -283,7 +283,7 @@ Optional arguments for plot_mode_field 6. The **-v level** option indicates the desired level of verbosity. The value of "level" will override the default. Setting the verbosity to 0 will make the tool run with no log messages, while increasing the verbosity will increase the amount of logging. -An example of the **plot_mode_field** calling sequence is shown below: +An example of the plot_mode_field calling sequence is shown below: .. code-block:: none diff --git a/met/docs/Users_Guide/reformat_grid.rst b/met/docs/Users_Guide/reformat_grid.rst index 9518cb0121..c2c5fe36ba 100644 --- a/met/docs/Users_Guide/reformat_grid.rst +++ b/met/docs/Users_Guide/reformat_grid.rst @@ -521,7 +521,7 @@ ____________________________ regrid = { ... } -See the **regrid entry** in :numref:Configuration File Details` for a description of the configuration file entries that control regridding. +See the **regrid entry** in :numref:`Configuration File Details` for a description of the configuration file entries that control regridding. ____________________________ diff --git a/met/docs/Users_Guide/reformat_point.rst b/met/docs/Users_Guide/reformat_point.rst index 4aa93baa23..50c30776c4 100644 --- a/met/docs/Users_Guide/reformat_point.rst +++ b/met/docs/Users_Guide/reformat_point.rst @@ -878,11 +878,9 @@ Required arguments for point2grid 2. The **to_grid** argument defines the output grid as: (1) a named grid, (2) the path to a gridded data file, or (3) an explicit grid specification string. - 3. The **output_filename** argument is the name of the output NetCDF file to be written. - -4. The **-field string** argument is a string that defines the data to be regridded. It may be used multiple times. If **-adp** option is given (for AOD data from GOES16/17), the name consists with the variable name from the input data file and the variable name from ADP data file (for example, “AOD_Smoke” or “AOD_Dust”: getting AOD variable from the input data and applying smoke or dust variable from ADP data file). +4. The **-field** string argument is a string that defines the data to be regridded. It may be used multiple times. 
If **-adp** option is given (for AOD data from GOES16/17), the name consists with the variable name from the input data file and the variable name from ADP data file (for example, “AOD_Smoke” or “AOD_Dust”: getting AOD variable from the input data and applying smoke or dust variable from ADP data file). Optional arguments for point2grid @@ -898,25 +896,18 @@ Optional arguments for point2grid 9. The **-gaussian_dx n** option defines the distance interval for Gaussian smoothing. The default is 81.271 km. Ignored if the method is not GAUSSIAN or MAXGAUSS. - 10. The **-gaussian_radius** n option defines the radius of influence for Gaussian interpolation. The default is 120. Ignored if the method is not GAUSSIAN or MAXGAUSS. - -11.The **-prob_cat_thresh string** option sets the threshold to compute the probability of occurrence. The default is set to disabled. This option is relevant when calculating practically perfect forecasts. - +11. The **-prob_cat_thresh string** option sets the threshold to compute the probability of occurrence. The default is set to disabled. This option is relevant when calculating practically perfect forecasts. 12. The **-vld_thresh n** option sets the required ratio of valid data for regridding. The default is 0.5. - 13. The **-name list** option specifies a comma-separated list of output variable names for each field specified. - 14. The **-log file** option directs output and errors to the specified log file. All messages will be written to that file as well as standard out and error. Thus, users can save the messages without having to redirect the output on the command line. The default behavior is no log file. - 15. The **-v level** option indicates the desired level of verbosity. The value of “level” will override the default setting of 2. Setting the verbosity to 0 will make the tool run with no log messages, while increasing the verbosity above 1 will increase the amount of logging. - 16. The **-compress level** option indicates the desired level of compression (deflate level) for NetCDF variables. The valid level is between 0 and 9. The value of “level” will override the default setting of 0 from the configuration file or the environment variable MET_NC_COMPRESS. Setting the compression level to 0 will make no compression for the NetCDF output. Lower number is for fast compression and higher number is for better compression. Only 4 interpolation methods are applied to the field variables; MIN/MAX/MEDIAN/UW_MEAN. The GAUSSIAN method is applied to the probability variable only. Unlike regrad_data_plane, MAX method is applied to the file variable and Gaussian method to the probability variable with the MAXGAUSS method. If the probability variable is not requested, MAXGAUSS method is the same as MAX method. diff --git a/met/docs/Users_Guide/release-notes.rst b/met/docs/Users_Guide/release-notes.rst index eaff37e90f..ca389d25bd 100644 --- a/met/docs/Users_Guide/release-notes.rst +++ b/met/docs/Users_Guide/release-notes.rst @@ -10,27 +10,41 @@ Version |version| release notes (|release_date|) Version 10.0_beta1 release notes (20201022) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -- Bugfixes since the version 9.1 release: - - Clarify madis2nc error messages (`#1409 `_). - - Fix tc_gen lead window filtering option (`#1465 `_). - - Clarify error messages for Xarray python embedding (`#1472 `_). - - Add support for Gaussian grids with python embedding (`#1477 `_). - - Fix ASCII file list parsing logic (`#1484 `_ and `#1508 `_). 
diff --git a/met/docs/Users_Guide/release-notes.rst b/met/docs/Users_Guide/release-notes.rst
index eaff37e90f..ca389d25bd 100644
--- a/met/docs/Users_Guide/release-notes.rst
+++ b/met/docs/Users_Guide/release-notes.rst
@@ -10,27 +10,41 @@ Version |version| release notes (|release_date|)
 Version 10.0_beta1 release notes (20201022)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-- Bugfixes since the version 9.1 release:
-  - Clarify madis2nc error messages (`#1409 `_).
-  - Fix tc_gen lead window filtering option (`#1465 `_).
-  - Clarify error messages for Xarray python embedding (`#1472 `_).
-  - Add support for Gaussian grids with python embedding (`#1477 `_).
-  - Fix ASCII file list parsing logic (`#1484 `_ and `#1508 `_).
-
-- Repository and build:
-  - Migrate GitHub respository from the NCAR to DTCenter organization (`#1462 `_).
-  - Add a GitHub pull request template (`#1516 `_).
-  - Resolve warnings from autoconf (`#1498 `_).
-  - Restructure nightly builds (`#1510 `_).
-
-- Documentation:
-  - Enhance and update documentation (`#1459 `_ and `#1460 `_).
-
-- Library code:
-  - Refine log messages when verifying probabilities (`#1502 `_).
-  - Enhance NetCDF library code to support additional data types (`#1492 `_ and `#1493 `_).
-
-- Application code:
-  - Update point_stat log messages (`#1514 `_).
-  - Enhance point2grid to support additional NetCDF point observation data sources (`#1345 `_, `#1509 `_, and `#1511 `_).
+* Bugfixes since the version 9.1 release:
+
+  * Clarify madis2nc error messages (`#1409 `_).
+
+  * Fix tc_gen lead window filtering option (`#1465 `_).
+
+  * Clarify error messages for Xarray python embedding (`#1472 `_).
+
+  * Add support for Gaussian grids with python embedding (`#1477 `_).
+
+  * Fix ASCII file list parsing logic (`#1484 `_ and `#1508 `_).
+
+* Repository and build:
+
+  * Migrate GitHub repository from the NCAR to DTCenter organization (`#1462 `_).
+
+  * Add a GitHub pull request template (`#1516 `_).
+
+  * Resolve warnings from autoconf (`#1498 `_).
+
+  * Restructure nightly builds (`#1510 `_).
+
+* Documentation:
+
+  * Enhance and update documentation (`#1459 `_ and `#1460 `_).
+
+* Library code:
+
+  * Refine log messages when verifying probabilities (`#1502 `_).
+
+  * Enhance NetCDF library code to support additional data types (`#1492 `_ and `#1493 `_).
+
+* Application code:
+
+  * Update point_stat log messages (`#1514 `_).
+
+  * Enhance point2grid to support additional NetCDF point observation data sources (`#1345 `_, `#1509 `_, and `#1511 `_).
diff --git a/met/docs/Users_Guide/series-analysis.rst b/met/docs/Users_Guide/series-analysis.rst
index 61bbb3ada9..bed3367d58 100644
--- a/met/docs/Users_Guide/series-analysis.rst
+++ b/met/docs/Users_Guide/series-analysis.rst
@@ -90,7 +90,7 @@ The Series-Analysis tool produces NetCDF files containing output statistics for
 series_analysis configuration file
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-The default configuration file for the Series-Analysis tool named *SeriesAnalysisConfig_default* can be found in the installed *share/met/config* directory. The contents of the configuration file are described in the subsections below.
+The default configuration file for the Series-Analysis tool named **SeriesAnalysisConfig_default** can be found in the installed *share/met/config* directory. The contents of the configuration file are described in the subsections below.

 Note that environment variables may be used when editing configuration files, as described in the :numref:`pb2nc configuration file` for the PB2NC tool.
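As a brief illustration of that note (the entry value and variable name below are hypothetical placeholders, not a required setup), a value exported in the shell can be referenced from a copy of the configuration file using the ${NAME} syntax:

.. code-block:: none

  export MODEL=GFS

  // Inside the configuration file:
  model = "${MODEL}";

The first line is run in the shell before calling the tool; the configuration entry then picks up whatever value the environment variable holds.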
diff --git a/met/docs/Users_Guide/stat-analysis.rst b/met/docs/Users_Guide/stat-analysis.rst
index e54d7c0ca2..9f997dcaae 100644
--- a/met/docs/Users_Guide/stat-analysis.rst
+++ b/met/docs/Users_Guide/stat-analysis.rst
@@ -218,7 +218,7 @@ Optional arguments for stat_analysis

 7. The **-v level** indicates the desired level of verbosity. The contents of "level" will override the default setting of 2. Setting the verbosity to 0 will make the tool run with no log messages, while increasing the verbosity will increase the amount of logging.

-An example of the **stat_analysis** calling sequence is shown below.
+An example of the stat_analysis calling sequence is shown below.

 .. code-block:: none

diff --git a/met/docs/Users_Guide/tc-pairs.rst b/met/docs/Users_Guide/tc-pairs.rst
index 3a282e8add..24ec641378 100644
--- a/met/docs/Users_Guide/tc-pairs.rst
+++ b/met/docs/Users_Guide/tc-pairs.rst
@@ -168,7 +168,7 @@ The following are valid baselines for the **best_baseline** field:

 **BTCLIP**: Neumann original 3-day CLIPER in best track mode. Used for the Atlantic basin only. Specify model as BCLP.

-**BTCLIP5**: 5-day CLIPER (:ref:`Aberson, 1998 `)/SHIFOR (:ref:`DeMaria and Knaff, 2003 ` in best track mode for either Atlantic or eastern North Pacific basins. Specify model as BCS5.
+**BTCLIP5**: 5-day CLIPER (:ref:`Aberson, 1998 `)/SHIFOR (:ref:`DeMaria and Knaff, 2003 `) in best track mode for either Atlantic or eastern North Pacific basins. Specify model as BCS5.

 **BTCLIPA**: Sim Aberson's recreation of Neumann original 3-day CLIPER in best-track mode. Used for Atlantic basin only. Specify model as BCLA.
diff --git a/met/docs/Users_Guide/wavelet-stat.rst b/met/docs/Users_Guide/wavelet-stat.rst
index 947f7fc392..3f110bab99 100644
--- a/met/docs/Users_Guide/wavelet-stat.rst
+++ b/met/docs/Users_Guide/wavelet-stat.rst
@@ -8,7 +8,7 @@ Wavelet-Stat Tool
 Introduction
 ____________

-The Wavelet-Stat tool decomposes two-dimensional forecasts and observations according to intensity and scale. This section describes the Wavelet-Stat tool, which enables users to apply the Intensity-Scale verification technique described by :ref:`Casati et al (2004) `.
+The Wavelet-Stat tool decomposes two-dimensional forecasts and observations according to intensity and scale. This section describes the Wavelet-Stat tool, which enables users to apply the Intensity-Scale verification technique described by :ref:`Casati et al. (2004) `.

 The Intensity-Scale technique is one of the recently developed verification approaches that focus on verification of forecasts defined over spatial domains. Spatial verification approaches, as opposed to point-by-point verification approaches, aim to account for the presence of features and for the coherent spatial structure characterizing meteorological fields. Since these approaches account for the intrinsic spatial correlation existing between nearby grid-points, they do not suffer from point-by-point comparison related verification issues, such as double penalties. Spatial verification approaches aim to account for the observation and forecast time-space uncertainties, and aim to provide feedback on the forecast error in physical terms.

@@ -26,7 +26,7 @@ __________________________________
 The method
 ~~~~~~~~~~

-:ref:`Casati et al (2004) ` applied the Intensity-Scale verification to preprocessed and re-calibrated (unbiased) data. The preprocessing was aimed to mainly normalize the data, and defined categorical thresholds so that each categorical bin had a similar sample size. The recalibration was performed to eliminate the forecast bias. Preprocessing and recalibration are not strictly necessary for the application of the Intensity-Scale technique. The MET Intensity-Scale Tool does not perform either, and applies the Intensity-Scale approach to biased forecasts, for categorical thresholds defined by the user.
+:ref:`Casati et al. (2004) ` applied the Intensity-Scale verification to preprocessed and re-calibrated (unbiased) data. The preprocessing was aimed mainly at normalizing the data and at defining categorical thresholds so that each categorical bin had a similar sample size. The recalibration was performed to eliminate the forecast bias.
Preprocessing and recalibration are not strictly necessary for the application of the Intensity-Scale technique. The MET Intensity-Scale Tool does not perform either, and applies the Intensity-Scale approach to biased forecasts, for categorical thresholds defined by the user.

 The Intensity Scale approach can be summarized in the following 5 steps:

@@ -42,9 +42,9 @@ The Intensity Scale approach can be summarized in the following 5 steps:

 **Note** that the MSE of the original binary fields is equal to the proportion of the counts of misses (**c/n**) and false alarms (**b/n**) for the contingency table (:numref:`contingency_table_counts`) obtained from the original forecast and observation fields by thresholding with the same threshold used to obtain the binary forecast and observation fields: :math:`{MSE}(t)=(b+c)/n`. This relation is intuitive when comparing the forecast and observation binary field difference and their corresponding contingency table image (:numref:`contingency_table_counts`).

-4. The MSE for the random binary forecast and observation fields is estimated by :math:`{MSE}(t) {random}= {FBI}*{Br}*(1-{Br}) + {Br}*(1- {FBI}*{Br})`, where :math:`{FBI}=(a+b)/(a+c)` is the frequency bias index and :math:`{Br}=(a+c)/n` is the sample climatology from the contingency table (:numref:`contingency_table_counts`) obtained from the original forecast and observation fields by thresholding with the same threshold used to obtain the binary forecast and observation fields. This formula follows by considering the :ref:`Murphy and Winkler (1987) ` framework, applying the Bayes' theorem to express the joint probabilities **b/n** and **c/n** as product of the marginal and conditional probability (e.g. :ref:`Jolliffe and Stephenson (2012) `; :ref:`Wilks, (2010) `), and then noticing that for a random forecast the conditional probability is equal to the unconditional one, so that **b/n** and **c/n** are equal to the product of the corresponding marginal probabilities solely.
+4. The MSE for the random binary forecast and observation fields is estimated by :math:`{MSE}(t)_{random} = {FBI}*{Br}*(1-{Br}) + {Br}*(1-{FBI}*{Br})`, where :math:`{FBI}=(a+b)/(a+c)` is the frequency bias index and :math:`{Br}=(a+c)/n` is the sample climatology from the contingency table (:numref:`contingency_table_counts`) obtained from the original forecast and observation fields by thresholding with the same threshold used to obtain the binary forecast and observation fields. This formula follows by considering the :ref:`Murphy and Winkler (1987) ` framework, applying Bayes' theorem to express the joint probabilities **b/n** and **c/n** as the product of the marginal and conditional probabilities (e.g. :ref:`Jolliffe and Stephenson, 2012 `; :ref:`Wilks, 2010 `), and then noticing that for a random forecast the conditional probability is equal to the unconditional one, so that **b/n** and **c/n** reduce to the product of the corresponding marginal probabilities.

-5. For each threshold (**t**) and scale component (**j**), the skill score based on the MSE of binary forecast and observation scale components is evaluated (:numref:`wavelet-stat_Intensity_Scale_skill_score_NIMROD`). The standard skill score definition as in :ref:`Jolliffe and Stephenson (2012) ` or :ref:`Wilks, (2010) ` is used, and random chance is used as reference forecast. The MSE for the random binary forecast is equipartitioned on the **n+1** scales to evaluate the skill score: :math:`{SS} (t,j)=1- {MSE}(t,j)*(n+1)/ {MSE}(t) {random}`
+5. For each threshold (**t**) and scale component (**j**), the skill score based on the MSE of binary forecast and observation scale components is evaluated (:numref:`wavelet-stat_Intensity_Scale_skill_score_NIMROD`). The standard skill score definition as in :ref:`Jolliffe and Stephenson (2012) ` or :ref:`Wilks (2010) ` is used, with random chance as the reference forecast. The MSE for the random binary forecast is equipartitioned on the **n+1** scales to evaluate the skill score: :math:`{SS}(t,j) = 1 - {MSE}(t,j)*(n+1)/{MSE}(t)_{random}`
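To make steps 3 through 5 concrete, consider a small worked example with hypothetical numbers (these values are purely illustrative and are not taken from the NIMROD case). Suppose that thresholding the original fields at threshold **t** gives a sample climatology :math:`{Br} = 0.15` and a frequency bias :math:`{FBI} = 1.0`, with misses and false alarms each covering 5% of the grid points, and that the dyadic domain yields :math:`8 + 1 = 9` scale components, one of which has :math:`{MSE}(t,j) = 0.02`. Then

.. math:: {MSE}(t) = (b+c)/n = 0.05 + 0.05 = 0.10

.. math:: {MSE}(t)_{random} = 1.0 \times 0.15 \times (1 - 0.15) + 0.15 \times (1 - 1.0 \times 0.15) = 0.255

.. math:: {SS}(t,j) = 1 - \frac{0.02 \times 9}{0.255} \approx 0.29

The positive value indicates skill at that scale and threshold for these assumed numbers; a negative value would indicate performance worse than random chance.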
The Intensity-Scale (IS) skill score evaluates the forecast skill as a function of the precipitation intensity and of the spatial scale of the error. Positive values of the IS skill score are associated with a skillful forecast, whereas negative values are associated with no skill. Usually large scales exhibit positive skill (large scale events, such as fronts, are well predicted), whereas small scales exhibit negative skill (small scale events, such as convective showers, are less predictable), and the smallest scale and highest thresholds exhibit the worst skill. For the NIMROD case illustrated, note the negative skill associated with the 160 km scale for thresholds up to 4 mm/h, due to the 160 km storm being displaced by almost its entire length.

@@ -256,7 +256,7 @@ _______________________

 member = 2;
 }

-The **wavelet_flag** and **wavelet_k** variables specify the type and shape of the wavelet to be used for the scale decomposition. The :ref:`Casati et al (2004) ` method uses a Haar wavelet which is a good choice for discontinuous fields like precipitation. However, users may choose to apply any wavelet family/shape that is available in the GNU Scientific Library. Values for the **wavelet_flag** variable, and associated choices for k, are described below:
+The **wavelet_flag** and **wavelet_k** variables specify the type and shape of the wavelet to be used for the scale decomposition. The :ref:`Casati et al. (2004) ` method uses a Haar wavelet, which is a good choice for discontinuous fields like precipitation. However, users may choose to apply any wavelet family/shape that is available in the GNU Scientific Library. Values for the **wavelet_flag** variable, and associated choices for k, are described below:

• **HAAR** for the Haar wavelet (member = 2).