Drop groups associated with nans in group variable #3406

Merged
merged 11 commits into pydata:master from fix/groupby-nan
Oct 28, 2019

Conversation

dcherian
Contributor

@dcherian dcherian commented Oct 16, 2019

@dcherian dcherian changed the title from "Drop nans in grouped variable." to "Drop groups associated with nans in group variable" Oct 16, 2019
@max-sixty
Collaborator

+1. This is how pandas treats NaNs

IIUC, it is a breaking change, so we'd need to leave this open or commit to 0.15 for next release
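
To make the semantics under discussion concrete, here is a minimal pure-Python sketch of grouping that drops entries whose group key is NaN, which is what pandas' `groupby` does by default. The helper name `group_dropping_nan` is hypothetical and exists only for illustration; this is not the code added by this pull request.

```python
import math
from collections import defaultdict


def group_dropping_nan(keys, values):
    """Group values by key, skipping entries whose key is NaN (illustrative only)."""
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        # NaN never compares equal to itself, so it cannot label a group.
        # Skip the entry instead of creating a NaN bucket.
        if isinstance(key, float) and math.isnan(key):
            continue
        groups[key].append(value)
    return dict(groups)


keys = [1.0, 1.0, float("nan"), 2.0]
values = [10, 20, 30, 40]

# The value 30 is paired with a NaN key, so it belongs to no group.
sums = {key: sum(vals) for key, vals in group_dropping_nan(keys, values).items()}
print(sums)
```

The entry with the NaN key is silently excluded rather than raising an error or forming its own group, which is the behavior the PR title describes.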

@dcherian
Contributor Author

I don't think so. Two of the tests I added fail on master with the IndexError in #2383. The rest pass on master.

@max-sixty
Collaborator

> I don't think so. Two of the tests I added fail on master with the IndexError in #2383. The rest pass on master.

You're right!

@dcherian
Contributor Author

Thanks for checking!

* upstream/master:
  Whatsnew for pydata#3419 (pydata#3422)
  Revert changes made in pydata#3358 (pydata#3411)
  Python3.6 idioms (pydata#3419)
  Temporarily mark pseudonetcdf-3.1 as incompatible (pydata#3420)
  Fix and add test for groupby_bins() isnan TypeError. (pydata#3405)
  Update where docstring to make return value type more clear (pydata#3408)
  tests for arrays with units (pydata#3238)
@dcherian dcherian mentioned this pull request Oct 22, 2019
12 tasks
* upstream/master:
  MAGA (Make Azure Green Again) (pydata#3436)
  Test that Dataset and DataArray resampling are identical (pydata#3412)
  Avoid multiplication DeprecationWarning in rasterio backend (pydata#3428)
  Sync with latest version of cftime (v1.0.4) (pydata#3430)
  Add cftime git tip to upstream-dev + temporarily pin cftime (pydata#3431)
* upstream/master:
  minor lint tweaks (pydata#3429)
  Hack around pydata#3440 (pydata#3442)
  Update Terminology page to account for multidimensional coordinates (pydata#3410)
  Use cftime master for upstream-dev build (pydata#3439)
@max-sixty
Collaborator

Shall we merge?

* upstream/master:
  Another groupby.reduce bugfix. (pydata#3403)
  add icomoon license (pydata#3448)
  change ALL_DIMS to equal ellipsis (pydata#3418)
  Escaping dtypes (pydata#3444)
  Html repr (pydata#3425)
@dcherian
Contributor Author

Fixed merge conflict.

@max-sixty max-sixty merged commit c955449 into pydata:master Oct 28, 2019
@max-sixty
Collaborator

Thanks @dcherian !

dcherian added a commit to dcherian/xarray that referenced this pull request Oct 29, 2019
* upstream/master:
  Remove deprecated behavior from dataset.drop docstring (pydata#3451)
  jupyterlab dark theme (pydata#3443)
  Drop groups associated with nans in group variable (pydata#3406)
  Allow ellipsis (...) in transpose (pydata#3421)
  Another groupby.reduce bugfix. (pydata#3403)
  add icomoon license (pydata#3448)
dcherian added a commit to dcherian/xarray that referenced this pull request Oct 29, 2019
* upstream/master:
  upgrade black verison to 19.10b0 (pydata#3456)
  Remove outdated code related to compatibility with netcdftime (pydata#3450)
  Remove deprecated behavior from dataset.drop docstring (pydata#3451)
  jupyterlab dark theme (pydata#3443)
  Drop groups associated with nans in group variable (pydata#3406)
  Allow ellipsis (...) in transpose (pydata#3421)
  Another groupby.reduce bugfix. (pydata#3403)
  add icomoon license (pydata#3448)
  change ALL_DIMS to equal ellipsis (pydata#3418)
  Escaping dtypes (pydata#3444)
  Html repr (pydata#3425)
dcherian added a commit to dcherian/xarray that referenced this pull request Oct 30, 2019
commit 08f7f74
Merge: 53c0f4e 278d2e6
Author: dcherian <deepak@cherian.net>
Date:   Tue Oct 29 09:36:58 2019 -0600

    Merge remote-tracking branch 'upstream/master' into fix/dask-computes

    * upstream/master:
      upgrade black verison to 19.10b0 (pydata#3456)
      Remove outdated code related to compatibility with netcdftime (pydata#3450)

commit 53c0f4e
Author: dcherian <deepak@cherian.net>
Date:   Tue Oct 29 09:25:27 2019 -0600

    Add identity check to lazy_array_equiv

commit 5e742e4
Author: dcherian <deepak@cherian.net>
Date:   Tue Oct 29 09:22:15 2019 -0600

    update whats new

commit ee0d422
Merge: e99148e 74ca69a
Author: dcherian <deepak@cherian.net>
Date:   Tue Oct 29 09:18:38 2019 -0600

    Merge remote-tracking branch 'upstream/master' into fix/dask-computes

    * upstream/master:
      Remove deprecated behavior from dataset.drop docstring (pydata#3451)
      jupyterlab dark theme (pydata#3443)
      Drop groups associated with nans in group variable (pydata#3406)
      Allow ellipsis (...) in transpose (pydata#3421)
      Another groupby.reduce bugfix. (pydata#3403)
      add icomoon license (pydata#3448)

commit e99148e
Author: dcherian <deepak@cherian.net>
Date:   Tue Oct 29 09:17:58 2019 -0600

    add concat test

commit 4a66e7c
Author: dcherian <deepak@cherian.net>
Date:   Mon Oct 28 10:19:32 2019 -0600

    review suggestions.

commit 8739ddd
Author: dcherian <deepak@cherian.net>
Date:   Mon Oct 28 08:32:15 2019 -0600

    better docstring

commit e84cc97
Author: dcherian <deepak@cherian.net>
Date:   Sun Oct 27 20:22:13 2019 -0600

    Optimize dask array equality checks.

    Dask arrays with the same graph have the same name. We can use this to quickly
    compare dask-backed variables without computing.

    Fixes pydata#3068 and pydata#3311
dcherian added a commit to dcherian/xarray that referenced this pull request Oct 30, 2019
commit bc39877
Merge: 507b1f6 278d2e6
Author: dcherian <deepak@cherian.net>
Date:   Tue Oct 29 09:36:30 2019 -0600

    Merge remote-tracking branch 'upstream/master' into dask-tokenize

    * upstream/master:
      upgrade black verison to 19.10b0 (pydata#3456)
      Remove outdated code related to compatibility with netcdftime (pydata#3450)
      Remove deprecated behavior from dataset.drop docstring (pydata#3451)
      jupyterlab dark theme (pydata#3443)
      Drop groups associated with nans in group variable (pydata#3406)
      Allow ellipsis (...) in transpose (pydata#3421)
      Another groupby.reduce bugfix. (pydata#3403)
      add icomoon license (pydata#3448)
      change ALL_DIMS to equal ellipsis (pydata#3418)
      Escaping dtypes (pydata#3444)
      Html repr (pydata#3425)

commit 507b1f6
Author: dcherian <deepak@cherian.net>
Date:   Tue Oct 29 09:34:47 2019 -0600

    Fix window test

commit 4ab6a66
Author: dcherian <deepak@cherian.net>
Date:   Thu Oct 24 14:30:57 2019 -0600

    Implement __dask_tokenize__
dcherian added a commit to dcherian/xarray that referenced this pull request Nov 2, 2019
commit 0711eb0
Author: dcherian <deepak@cherian.net>
Date:   Thu Oct 31 21:18:58 2019 -0600

    bugfix.

commit 4ee2963
Author: Deepak Cherian <dcherian@users.noreply.github.com>
Date:   Thu Oct 31 11:27:05 2019 -0600

    pep8

commit 6e4c11f
Merge: 08f7f74 53c5199
Author: Deepak Cherian <dcherian@users.noreply.github.com>
Date:   Thu Oct 31 11:25:12 2019 -0600

    Merge branch 'master' into fix/dask-computes

dcherian added a commit to dcherian/xarray that referenced this pull request Nov 4, 2019
* upstream/master:
  __dask_tokenize__ (pydata#3446)
  Type check sentinel values (pydata#3472)
  Fix typo in docstring (pydata#3474)
  fix test suite warnings re `drop` (pydata#3460)
  Fix integrate docs (pydata#3469)
  Fix leap year condition in monthly means example (pydata#3464)
  Hypothesis tests for roundtrip to & from pandas (pydata#3285)
  unpin cftime (pydata#3463)
  Cleanup whatsnew (pydata#3462)
  enable xr.ALL_DIMS in xr.dot (pydata#3424)
  Merge stable into master (pydata#3457)
  upgrade black verison to 19.10b0 (pydata#3456)
  Remove outdated code related to compatibility with netcdftime (pydata#3450)
  Remove deprecated behavior from dataset.drop docstring (pydata#3451)
  jupyterlab dark theme (pydata#3443)
  Drop groups associated with nans in group variable (pydata#3406)
  Allow ellipsis (...) in transpose (pydata#3421)
  Another groupby.reduce bugfix. (pydata#3403)
  add icomoon license (pydata#3448)
@dcherian dcherian deleted the fix/groupby-nan branch January 5, 2022 18:57
Development

Successfully merging this pull request may close these issues.

groupby().apply() on variable with NaNs raises IndexError
2 participants