Accept int value in head, thin and tail #3298

griverat · 2019-09-09T21:00:41Z

Related #3278
This PR makes the methods head, thin and tail for both DataArray and Dataset accept a single integer value as a parameter. If no parameter is given, then it defaults to 5.

Tests added
Passes black . && mypy . && flake8

max-sixty

This is great! Thanks @DangoMelon

I'll wait for a day in case of other comments

xarray/core/dataset.py

xarray/core/dataarray.py

shoyer · 2019-09-10T06:27:00Z

xarray/core/dataset.py

    ) -> "Dataset":
        """Returns a new dataset with each array indexed along every `n`th
        value for the specified dimension(s)

        Parameters
        ----------
-        indexers : dict, optional
-            A dict with keys matching dimensions and integer values `n`.
+        indexers : dict or int, default: 5


Does it make sense to thin by a factor of five by default? Or should we not have a default value? The use case for a default thinning value are less clear to me than defaults for head/tail.

I think picking a default makes it very convenient to use. And this is a convenience method...

I set a default value to thin to make it even with the other two methods (really didn't think that much of it usage) but I agree with @dcherian that it might be ok to have it for convenience.

I don't have a strong enough view. I agree it's a convenience method, but a convenient value significantly depends on the size of the array, unlike with head & tail

So whatever you think. I'm probably a -0.2

max-sixty · 2019-09-12T19:04:05Z

xarray/tests/test_dataarray.py

+        with raises_regex(TypeError, "must be an int"):
+            self.dv.tail(x=3.1)
+        with raises_regex(ValueError, "must be positive"):
+            self.dv.tail(-3)


Very thorough tests! Thank you!

max-sixty · 2019-09-12T22:53:36Z

Any other feedback before we merge? (Errors are unrelated)

shoyer · 2019-09-12T23:05:44Z

xarray/core/dataset.py

+            if not isinstance(v, int):
+                raise TypeError("indexer value must be an integer")
+            elif v < 0:
+                raise ValueError("indexer value must be positive")


It is help to include a little more context in error messages if possible. In this case, you could include offending the name and value.

Hmmm, Something along these lines maybe?

"expected integer as indexer value, found type %r for dim %r" % (type(v), k)

and

"expected positive integer as indexer value for dim %r" % k

The k and v come from iterating over indexers.items()

shoyer

Looks good to me, though it would be nice to add more context to the error messages.

griverat · 2019-09-14T03:44:34Z

Any other feedback before we merge? (Errors are unrelated)

@max-sixty I think the typing errors are due to changing the Any to int and actually finding slice.

shoyer · 2019-09-14T05:46:45Z

I would try to include value in the message, too, which can often be a useful hint about what went wrong, e.g., “expected positive integer for dimension ‘x’, got -1”

…

On Fri, Sep 13, 2019 at 8:56 PM Gerardo Rivera ***@***.***> wrote: ***@***.**** commented on this pull request. ------------------------------ In xarray/core/dataset.py <#3298 (comment)>: > indexers = either_dict_or_kwargs(indexers, indexers_kwargs, "head") + for v in indexers.values(): + if not isinstance(v, int): + raise TypeError("indexer value must be an integer") + elif v < 0: + raise ValueError("indexer value must be positive") Hmmm, Something along these lines maybe? "expected integer as indexer value, found type %r for dim %r" % (type(v), k) and "expected positive integer as indexer value for dim %r" % k The k and v come from iterating over indexers.items() — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#3298?email_source=notifications&email_token=AAJJFVVWFOWK5LYL33Z6Q4DQJROHDA5CNFSM4IVADGGKYY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOCEXTGGA#discussion_r324411117>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAJJFVRJWPWIG5XBDWEPRMLQJROHDANCNFSM4IVADGGA> .

max-sixty · 2019-09-14T13:08:45Z

Here's a solution to the mypy issue:

diff --git a/xarray/core/dataset.py b/xarray/core/dataset.py
index 1d762114..ee4ebf4d 100644
--- a/xarray/core/dataset.py
+++ b/xarray/core/dataset.py
@@ -2046,8 +2046,8 @@ class Dataset(Mapping, ImplementsDatasetReduce, DataWithCoords):
                 raise TypeError("indexer value must be an integer")
             elif v < 0:
                 raise ValueError("indexer value must be positive")
-        indexers = {k: slice(val) for k, val in indexers.items()}
-        return self.isel(indexers)
+        indexers_slices = {k: slice(val) for k, val in indexers.items()}
+        return self.isel(indexers_slices)
 
     def tail(
         self,
@@ -2087,11 +2087,11 @@ class Dataset(Mapping, ImplementsDatasetReduce, DataWithCoords):
                 raise TypeError("indexer value must be an integer")
             elif v < 0:
                 raise ValueError("indexer value must be positive")
-        indexers = {
+        indexers_slices = {
             k: slice(-val, None) if val != 0 else slice(val)
             for k, val in indexers.items()
         }
-        return self.isel(indexers)
+        return self.isel(indexers_slices)
 
     def thin(
         self,
@@ -2134,8 +2134,8 @@ class Dataset(Mapping, ImplementsDatasetReduce, DataWithCoords):
                 raise ValueError("indexer value must be positive")
             elif v == 0:
                 raise ValueError("step cannot be zero")
-        indexers = {k: slice(None, None, val) for k, val in indexers.items()}
-        return self.isel(indexers)
+        indexers_slices = {k: slice(None, None, val) for k, val in indexers.items()}
+        return self.isel(indexers_slices)
 
     def broadcast_like(
         self, other: Union["Dataset", "DataArray"], exclude: Iterable[Hashable] = None

griverat · 2019-09-14T21:18:35Z

@shoyer I hope it's much clear now, I tried to phrase what you suggested.
@max-sixty That did the trick!
Thank you both for your suggestions.

shoyer · 2019-09-14T21:46:22Z

thanks @DangoMelon !

jhamman · 2019-09-15T07:05:58Z

@DangoMelon - thanks for your contribution. This will get used a lot!

* master: Fix whats-new date :/ Revert to dev version Release v0.13.0 auto_combine deprecation to 0.14 (pydata#3314) Deprecation: groupby, resample default dim. (pydata#3313) Raise error if cmap is list of colors (pydata#3310) Refactor concat to use merge for non-concatenated variables (pydata#3239) Honor `keep_attrs` in DataArray.quantile (pydata#3305) Fix DataArray api doc (pydata#3309) Accept int value in head, thin and tail (pydata#3298) ignore h5py 2.10.0 warnings and fix invalid_netcdf warning test. (pydata#3301) Update why-xarray.rst with clearer expression (pydata#3307) Compat and encoding deprecation to 0.14 (pydata#3294) Remove deprecated concat kwargs. (pydata#3288) allow np-array levels and colors in 2D plots (pydata#3295) Remove some deprecations (pydata#3292) Make argmin/max work lazy with dask (pydata#3244) Add head, tail and thin methods (pydata#3278) Updater to testing environment name (pydata#3253)

* upstream/master: (43 commits) Add hypothesis support to related projects (#3335) More doc fixes (#3333) Improve the documentation of swap_dims (#3331) fix the doc names of the return value of swap_dims (#3329) Fix isel performance regression (#3319) Allow weakref (#3318) Clarify that "scatter" is a plotting method in what's new. (#3316) Fix whats-new date :/ Revert to dev version Release v0.13.0 auto_combine deprecation to 0.14 (#3314) Deprecation: groupby, resample default dim. (#3313) Raise error if cmap is list of colors (#3310) Refactor concat to use merge for non-concatenated variables (#3239) Honor `keep_attrs` in DataArray.quantile (#3305) Fix DataArray api doc (#3309) Accept int value in head, thin and tail (#3298) ignore h5py 2.10.0 warnings and fix invalid_netcdf warning test. (#3301) Update why-xarray.rst with clearer expression (#3307) Compat and encoding deprecation to 0.14 (#3294) ...

griverat added 3 commits September 9, 2019 15:45

Accept int value in head, thin and tail

ad7560e

Merge remote-tracking branch 'origin/master' into defvals-head-thin-tail

5d76cd6

Fix typing

6ca0b42

max-sixty approved these changes Sep 10, 2019

View reviewed changes

xarray/core/dataset.py Outdated Show resolved Hide resolved

xarray/core/dataset.py Outdated Show resolved Hide resolved

xarray/core/dataset.py Show resolved Hide resolved

shoyer reviewed Sep 10, 2019

View reviewed changes

Remove thin def val and add suggestions

d9cbcda

max-sixty reviewed Sep 12, 2019

View reviewed changes

shoyer reviewed Sep 12, 2019

View reviewed changes

shoyer approved these changes Sep 12, 2019

View reviewed changes

Fix typing and change raise message

4fec603

shoyer merged commit 7fb3b19 into pydata:master Sep 14, 2019

griverat deleted the defvals-head-thin-tail branch September 15, 2019 04:52

Uh oh!

Accept int value in head, thin and tail #3298

Accept int value in head, thin and tail #3298

Uh oh!

Conversation

griverat commented Sep 9, 2019

Uh oh!

max-sixty left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

max-sixty commented Sep 12, 2019

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

shoyer left a comment

Choose a reason for hiding this comment

Uh oh!

griverat commented Sep 14, 2019

Uh oh!

shoyer commented Sep 14, 2019 via email

Uh oh!

max-sixty commented Sep 14, 2019

Uh oh!

griverat commented Sep 14, 2019

Uh oh!

shoyer commented Sep 14, 2019

Uh oh!

jhamman commented Sep 15, 2019

Uh oh!

Uh oh!