Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

weighted for xr.corr #4768

Closed
aaronspring opened this issue Jan 5, 2021 · 2 comments · Fixed by #8527
Closed

weighted for xr.corr #4768

aaronspring opened this issue Jan 5, 2021 · 2 comments · Fixed by #8527

Comments

@aaronspring
Copy link
Contributor

aaronspring commented Jan 5, 2021

Is your feature request related to a problem? Please describe.
I want to make weighted correlation, e.g. spatial correlation but weighted xr.corr(fct,obs,dim=['lon','lat'], weights=np.cos(np.abs(fct.lat))) So far, xr.corr does not accept weights or input.weighted(weights). A more straightforward case would be weighting of different members: xr.corr(fct,obs,dim='member',weights=np.arange(fct.member.size))

Describe the solution you'd like
We started xskillscore https://github.com/xarray-contrib/xskillscore some time ago, before xr.corr was implemented and have keywords weighted, skipna and keep_attrs implemented. We also have xs.rmse, xs.mse, ... implemented via xr.apply_ufunc https://github.com/aaronspring/xskillscore/blob/150f7b9b2360750e6077036c7c3fd6e4439c60b6/xskillscore/core/deterministic.py#L849 which are faster than xr-based versions of mse https://github.com/aaronspring/xskillscore/blob/150f7b9b2360750e6077036c7c3fd6e4439c60b6/xskillscore/xr/deterministic.py#L6 or xr.corr, see xarray-contrib/xskillscore#231

Additional context
My question here is whether it would be better to move these xskillscore metrics upward into xarray or start a PR for weighted and skipna for xr.corr (what I prefer).

@lluritu
Copy link
Contributor

lluritu commented Dec 5, 2023

Hello, I wrote a modification of _cor_cov which allows using weights in the computation of correlation and covariance (see below).

In contrast to the version available on xskillscore, this code aligns the weights with the data, consistently with how xarray treats weights in general.
In this code I disregarded the ddof parameter, which is indeed not used in the main cor func, but it is needed for cov, so there is still a bit of work to do.

Is this something useful/interesting to merge into xarray? If so I could prepare a PR and so on.
Llorenç Lledó

def weighted_cov_corr(da_a, da_b, weights, dim = None, method = None):
    """
    Internal method for weighted cov or cor.
    """
    # 1. Broadcast the two arrays
    da_a, da_b = xr.align(da_a, da_b, join="inner", copy=False)

    # 2. Ignore the nans
    valid_values = da_a.notnull() & da_b.notnull()
    da_a = da_a.where(valid_values)
    da_b = da_b.where(valid_values)
    #valid_count = valid_values.sum(dim) - ddof

    # 3. Detrend along the given dim
    demeaned_da_a = da_a - da_a.weighted(weights).mean(dim=dim)
    demeaned_da_b = da_b - da_b.weighted(weights).mean(dim=dim)

    # 4. Compute covariance along the given dim
    # N.B. `skipna=True` is required or auto-covariance is computed incorrectly. E.g.
    # Try xr.cov(da,da) for da = xr.DataArray([[1, 2], [1, np.nan]], dims=["x", "time"])
    cov = (demeaned_da_a.conj() * demeaned_da_b).weighted(weights).mean(dim=dim, skipna=True)

    if method == "cov":
        return cov
    else:
        # compute std + corr
        da_a_std = da_a.weighted(weights).std(dim=dim)
        da_b_std = da_b.weighted(weights).std(dim=dim)
        corr = cov / (da_a_std * da_b_std)
        return corr

@dcherian
Copy link
Contributor

dcherian commented Dec 5, 2023

Thanks @lluritu a PR would be very welcome!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants