Enable more ergonomic seasonal grouping and resampling #10198

Closed
@dcherian

Description

@dcherian
Contributor

TLDR

I propose merging #9524 (docs) to provide this new API (after review).

from xarray.groupers import SeasonGrouper, SeasonResampler

ds.groupby(time=SeasonGrouper(["DJFM", "MAMJ", "JJAS", "SOND"])).mean()
ds.resample(time=SeasonResampler(["DJF", "MAM", "JJA", "SON"])).mean()

Is your feature request related to a problem?

Current Status

Xarray supports a very simple form of seasonal grouping: groupby("time.season"), which has a fixed definition of seasons (DJF, MAM, JJA, SON) and doesn't enforce proper ordering of the output (the seasons get sorted as strings to give DJF, JJA, MAM, SON :/ ).
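
To make the ordering problem concrete, here is a minimal plain-Python sketch (no Xarray involved) of what sorting the season labels as strings does, plus one way to restore calendar order. The variable names are illustrative only.

```python
# The fixed season labels produced by "time.season", in calendar order.
calendar_order = ["DJF", "MAM", "JJA", "SON"]

# Sorting the labels as plain strings loses the calendar order.
alphabetical = sorted(calendar_order)
print(alphabetical)

# One workaround: sort by each label's position in the intended order.
rank = {label: i for i, label in enumerate(calendar_order)}
restored = sorted(alphabetical, key=rank.__getitem__)
print(restored)
```

With an actual groupby result, the same idea is usually applied by reindexing along the season dimension with the desired label order.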

We support slightly more complex resampling using Pandas syntax, for example .resample(time="QS-Jan"), but I think this is limited to seasons that are 3 months long.

User Requests

A quick scan of issues, discussions, and StackOverflow shows that our users want more control over how seasons are specified.

  • Don't include "incomplete" seasons in output.
  • Allow custom season definitions (e.g. of varying length, overlapping seasons).
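
As a rough illustration of the first request, the sketch below decides which occurrences of a season are "complete" given the (year, month) pairs present in a time axis. The function name, the tuple-of-month-numbers input, and the convention of labelling a season by the year in which it starts are all assumptions made for this example, not the behaviour of any Xarray API.

```python
def complete_seasons(available, months):
    """Return the start years for which every month of the season is present.

    available: set of (year, month) pairs found in the data.
    months: month numbers of the season in order, e.g. (12, 1, 2) for DJF.
    """
    start_years = sorted({year for year, _ in available})
    complete = []
    for start in start_years:
        year, previous = start, 0
        needed = []
        for month in months:
            if month < previous:
                # The season wrapped past December into the next year.
                year += 1
            needed.append((year, month))
            previous = month
        if all(pair in available for pair in needed):
            complete.append(start)
    return complete


# Two full calendar years of monthly data: Jan 2000 through Dec 2001.
available = {(y, m) for y in (2000, 2001) for m in range(1, 13)}

djf = complete_seasons(available, (12, 1, 2))
jja = complete_seasons(available, (6, 7, 8))
print(djf, jja)
```

Here only the DJF season starting in December 2000 is complete; the one starting in December 2001 lacks January and February 2002, and the leading January and February 2000 never form a full season.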

Here is a list of user requests:

Describe the solution you'd like

The problem of custom seasons is simply that of converting the seasons to proper integer codes. Our relatively new Grouper objects provide this extension point.
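
A minimal sketch of that conversion, in plain Python: each season string is matched against the cyclic sequence of month initials, and every month is assigned the integer code(s) of the season(s) containing it. This is not the implementation in the PR; `season_to_months` and `month_codes` are hypothetical helper names used only for illustration.

```python
MONTH_LETTERS = "JFMAMJJASOND"


def season_to_months(season):
    """Return month numbers (1-12) for a season string such as "DJF"."""
    # Search a doubled string so seasons that wrap past December are found.
    start = (MONTH_LETTERS * 2).find(season.upper())
    if start < 0 or not season:
        raise ValueError(f"not a run of consecutive month initials: {season!r}")
    # Note: a single letter such as "J" is ambiguous and resolves to the
    # first match (January).
    return tuple((start + k) % 12 + 1 for k in range(len(season)))


def month_codes(seasons):
    """Map each month number to the list of season codes it belongs to."""
    codes = {month: [] for month in range(1, 13)}
    for code, season in enumerate(seasons):
        for month in season_to_months(season):
            codes[month].append(code)
    return codes


standard = month_codes(["DJF", "MAM", "JJA", "SON"])
overlapping = month_codes(["DJFM", "MAMJ", "JJAS", "SOND"])
print(standard[12], overlapping[3])
```

With non-overlapping seasons every month gets exactly one code; with overlapping seasons a month such as March belongs to two groups, which is why a simple one-label-per-timestamp approach is not enough.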

I have implemented this in #9524 (docs). The code isn't pretty and probably doesn't scale well for very long time vectors, but I focused on correctness and tests.

Describe alternatives you've considered

This could live outside Xarray, but it is such a common ask from our userbase that it seems worthy of inclusion.

Activity

trexfeathers commented on Apr 3, 2025

This could live outside Xarray, but it is such a common ask from our userbase that it seems worthy of inclusion.

Just in case this is useful to anyone. Not as flexible as the proposal, but available right now.

from pathlib import Path
from tempfile import TemporaryDirectory

from iris.coord_categorisation import add_season
from ncdata.iris_xarray import cubes_to_xarray, cubes_from_xarray
from ncdata.threadlock_sharing import enable_lockshare
import requests
import xarray as xr


enable_lockshare(iris=True, xarray=True)


with TemporaryDirectory() as tmpdirname:
    url = "https://github.com/pydata/xarray-data/raw/refs/heads/master/air_temperature.nc"
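    response = requests.get(url)
    # Fail early on a bad download instead of writing an error page to disk.
    response.raise_for_status()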

    file_path = Path(tmpdirname) / "air_temperature.nc"

    with file_path.open("wb") as file_write:
        file_write.write(response.content)

    dataset = xr.open_dataset(file_path)

    (cube,) = cubes_from_xarray(dataset)
    # NOTE: this can't support overlapping seasons.
    add_season(cube, "time", name="season", seasons=["DJFM", "AM", "JJ", "ASON"])

    season_dataset = cubes_to_xarray(cube)

dcherian commented on Apr 8, 2025

Nice, thanks @trexfeathers


Since there are 5 👍, I will open #9524 for review and bring it up for discussion at our regular meeting tomorrow.

dcherian commented on Apr 29, 2025

A couple of weeks ago, we decided to merge, and possibly to allow month indices as input in the future. I will merge the PR shortly.

bweeding commented on May 9, 2025

Hi @dcherian, this will be fantastic for tropical/growing season analysis! Do you know when it is likely to be released as part of a package update?

Cheers
Ben
