Skip to content

Commit

Permalink
Support freq="W" in standardized indices (#1952)
Browse files Browse the repository at this point in the history
<!--Please ensure the PR fulfills the following requirements! -->
<!-- If this is your first PR, make sure to add your details to the
AUTHORS.rst! -->
### Pull Request Checklist:
- [ ] This PR addresses an already opened issue (for bug fixes /
features)
    - This PR fixes #xyz
- [ ] Tests for the changes have been added (for bug fixes / features)
- [ ] (If applicable) Documentation has been added / updated (for bug
fixes / features)
- [ ] CHANGELOG.rst has been updated (with summary of main changes)
- [ ] Link to issue (:issue:`number`) and pull request (:pull:`number`)
has been added

### What kind of change does this PR introduce?

* Add support for weekly standardized indices

### Does this PR introduce a breaking change?

No

### Other information:

I realize that xclim and climate_indices treat zero-inflated
distributions differently. I will investigate which library does it
correctly and include changes if needed

EDIT: I believe `xclim` has the correct implementation. The probability
of zeroes should be determined in the fitting procedure, not by using
the full dataset. The idea is when you compute a CDF, a zero value
should be mapped to the `prob_of_zero` in the calibration period. That
is the same logic as using the fitting params of the distribution in the
calibration period to compute the CDF in the full dataset.

I confirmed this is also how it's done in the case of the R package
(SPEI), although I find some steps a bit weird, but I'm not too familiar
with R.
  • Loading branch information
coxipi authored Oct 17, 2024
2 parents 89220d6 + fd80bdf commit 70cbb36
Show file tree
Hide file tree
Showing 4 changed files with 53 additions and 9 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.rst
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ New features and enhancements
* ``xclim.indices.run_length.runs_with_holes`` allows to input a condition that must be met for a run to start and a second condition that must be met for the run to stop. (:pull:`1778`).
* New generic compute function ``xclim.indices.generic.thresholded_events`` that finds events based on a threshold condition and returns basic stats for each. See also ``xclim.indices.run_length.find_events``. (:pull:`1778`).
* ``xclim.core.units.rate2amount`` and ``xclim.core.units.amount2rate`` can now also accept quantities (pint objects or strings), in which case the ``dim`` argument must be the ``time`` coordinate through which we can find the sampling rate. (:pull:`1778`).
* ``xclim.indices.stats.standardized_index`` now supports a weekly resampling frequency. Only "standard" calendars using numpy's `datetime64` dtype are supported for this mode. (:issue:`1892`, :pull:`1952`)

Bug fixes
^^^^^^^^^
Expand Down
39 changes: 38 additions & 1 deletion tests/test_indices.py
Original file line number Diff line number Diff line change
Expand Up @@ -623,6 +623,38 @@ class TestStandardizedIndices:
[-0.24417774, -0.11404418, 0.64997039, 1.07670517, 0.6462852],
2e-2,
),
(
"W",
1,
"gamma",
"APP",
[0.64820146, 0.04991201, -1.62956493, 1.08898709, -0.01741762],
2e-2,
),
(
"W",
12,
"gamma",
"APP",
[-1.08683311, -0.47230036, -0.7884111, 0.3341876, 0.06282969],
2e-2,
),
(
"W",
1,
"gamma",
"ML",
[0.64676962, -0.06904886, -1.60493289, 1.07864037, -0.01415902],
2e-2,
),
(
"W",
12,
"gamma",
"ML",
[-1.08627775, -0.46491398, -0.77806462, 0.31759127, 0.03794528],
2e-2,
),
],
)
def test_standardized_precipitation_index(
Expand All @@ -636,8 +668,13 @@ def test_standardized_precipitation_index(
pytest.skip("Skipping SPI/ML/D on older numpy")
ds = open_dataset("sdba/CanESM2_1950-2100.nc").isel(location=1)
if freq == "D":
ds = ds.convert_calendar("366_day", missing=np.nan)
# to compare with ``climate_indices``
ds = ds.convert_calendar("366_day", missing=np.nan)
elif freq == "W":
# only standard calendar supported with freq="W"
ds = ds.convert_calendar(
"standard", missing=np.nan, align_on="year", use_cftime=False
)
pr = ds.pr.sel(time=slice("1998", "2000"))
pr_cal = ds.pr.sel(time=slice("1950", "1980"))
fitkwargs = {}
Expand Down
12 changes: 6 additions & 6 deletions xclim/indices/_agro.py
Original file line number Diff line number Diff line change
Expand Up @@ -1162,6 +1162,8 @@ def standardized_precipitation_index(
* N-month SPI / N-day SPI is determined by choosing the `window = N` and the appropriate frequency `freq`.
* Supported statistical distributions are: ["gamma", "fisk"], where "fisk" is scipy's implementation of
a log-logistic distribution
* Supported frequencies are daily ("D"), weekly ("W"), and monthly ("MS").
Weekly frequency will only work if the input array has a "standard" (non-cftime) calendar.
* If `params` is given as input, it overrides the `cal_start`, `cal_end`, `freq` and `window`, `dist` and `method` options.
* "APP" method only supports two-parameter distributions. Parameter `loc` needs to be fixed to use method `APP`.
* The standardized index is bounded by ±8.21. 8.21 is the largest standardized index as constrained by the float64 precision in
Expand Down Expand Up @@ -1204,7 +1206,7 @@ def standardized_precipitation_index(
:cite:cts:`mckee_relationship_1993`
"""
fitkwargs = fitkwargs or {}
dist_methods = {"gamma": ["ML", "APP", "PWM"], "fisk": ["ML", "APP"]}
dist_methods = {"gamma": ["ML", "APP"], "fisk": ["ML", "APP"]}
if dist in dist_methods:
if method not in dist_methods[dist]:
raise NotImplementedError(
Expand Down Expand Up @@ -1267,10 +1269,8 @@ def standardized_precipitation_evapotranspiration_index(
dist : {'gamma', 'fisk'}
Name of the univariate distribution. (see :py:mod:`scipy.stats`).
method : {'APP', 'ML'}
Name of the fitting method, such as `ML` (maximum likelihood), `APP` (approximate), or
`PWM` (probability weighted moments).
The approximate method uses a deterministic function that doesn't involve any optimization. Available methods
vary with the distribution: 'gamma':{'APP', 'ML', 'PWM'}, 'fisk':{'APP', 'ML'}
Name of the fitting method, such as `ML` (maximum likelihood), `APP` (approximate). The approximate method
uses a deterministic function that doesn't involve any optimization.
fitkwargs : dict, optional
Kwargs passed to ``xclim.indices.stats.fit`` used to impose values of certains parameters (`floc`, `fscale`).
cal_start : DateStr, optional
Expand Down Expand Up @@ -1298,7 +1298,7 @@ def standardized_precipitation_evapotranspiration_index(
"""
fitkwargs = fitkwargs or {}

dist_methods = {"gamma": ["ML", "APP", "PWM"], "fisk": ["ML", "APP"]}
dist_methods = {"gamma": ["ML", "APP"], "fisk": ["ML", "APP"]}
if dist in dist_methods:
if method not in dist_methods[dist]:
raise NotImplementedError(
Expand Down
10 changes: 8 additions & 2 deletions xclim/indices/stats.py
Original file line number Diff line number Diff line change
Expand Up @@ -540,6 +540,8 @@ def _fit_start(x, dist: str, **fitkwargs: Any) -> tuple[tuple, dict]:
loc0 = (x1 * xn - x2**2) / (x1 + xn - 2 * x2)
loc0 = loc0 if loc0 < x1 else (0.9999 * x1 if x1 > 0 else 1.0001 * x1)
x_pos = x - loc0
# TODO: change this?
# not necessary for log-logistic, according to SPEI package
x_pos = x_pos[x_pos > 0]
# method of moments:
# LHS is computed analytically with the two-parameters log-logistic distribution
Expand Down Expand Up @@ -675,6 +677,8 @@ def preprocess_standardized_index(
group = "time.dayofyear"
elif compare_offsets(final_freq, "==", "MS"):
group = "time.month"
elif compare_offsets(final_freq, "==", "W"):
group = "time.week"
else:
raise ValueError(
f"The input (following resampling if applicable) has a frequency `{final_freq}` "
Expand Down Expand Up @@ -775,8 +779,7 @@ def standardized_index_fit_params(
"Pass a value for `floc` in `fitkwargs`."
)

# "WPM" method doesn't seem to work for gamma or pearson3
dist_and_methods = {"gamma": ["ML", "APP", "PWM"], "fisk": ["ML", "APP"]}
dist_and_methods = {"gamma": ["ML", "APP"], "fisk": ["ML", "APP"]}
dist = get_dist(dist)
if dist.name not in dist_and_methods:
raise NotImplementedError(f"The distribution `{dist.name}` is not supported.")
Expand Down Expand Up @@ -929,6 +932,9 @@ def reindex_time(da, da_ref, group):
elif group == "time.month":
da = da.rename(month="time").reindex(time=da_ref.time.dt.month)
da["time"] = da_ref.time
elif group == "time.week":
da = da.rename(week="time").reindex(time=da_ref.time.dt.week)
da["time"] = da_ref.time
# I don't think rechunking is necessary here, need to check
return da if not uses_dask(da) else da.chunk({"time": -1})

Expand Down

0 comments on commit 70cbb36

Please sign in to comment.