Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor improvement of docstring for Dataset #2904

Merged
merged 1 commit into from
Apr 17, 2019

Conversation

lumbric
Copy link
Contributor

@lumbric lumbric commented Apr 17, 2019

This might help to avoid confusion. data_vars is always a mapping, not a
mapping, a variable or a tuple.

Passing just a tuple, does not work of course. But for xarray newbies, this might be less obvious and the error message is also not easy to interpret:

>>> xr.Dataset(('dim1', np.ones(5)))
...
TypeError: unhashable type: 'numpy.ndarray'

The correct version of the example above should be:

>>> xr.Dataset({'myvar': ('dim1', np.ones(5))})                                                                            
<xarray.Dataset>
Dimensions:  (dim1: 5)
Dimensions without coordinates: dim1
Data variables:
    myvar    (dim1) float64 1.0 1.0 1.0 1.0 1.0

This might help to avoid confusion. data_vars is always a mapping, not a
mapping, a variable or a tuple.
@max-sixty
Copy link
Collaborator

These are great! Agree on the sentence construction.

Any others are welcome, and please feel free to add an entry & attribution in the whats-new at some point

@max-sixty
Copy link
Collaborator

max-sixty commented Apr 17, 2019

(agree on the error message, I think we could do a better job there. We already made some minor changes around sets but we could make broader checks for these initial entry points that new users first encounter)

@lumbric
Copy link
Contributor Author

lumbric commented Apr 17, 2019

Hm yes, good error messages would be great, but I feel like it is widely accepted that in the scientific Python ecosystem error messages are hard to read quite often. Maybe this is the downside the duck typing? I've mentioned this only as explanation why I was so confused after running xr.Dataset for the first time.

@max-sixty
Copy link
Collaborator

I feel like it is widely accepted that in the scientific Python ecosystem error messages are hard to read quite often. Maybe this is the downside the duck typing?

We can make it better!
While duck typing does allow passing incorrectly typed arguments, it doesn't prevent us from checking their types. Python now has types like Mapping or Sized which can be used more like protocols and so don't give up flexibility.
So I think it would be strictly dominant for us to check for an instance of Mapping there, and raise a readable error message if it fails.

@lumbric
Copy link
Contributor Author

lumbric commented Apr 17, 2019

Ah yes, true! I've confused something here. dict() accepts mappings, but not everything dict() accepts is a mapping. xr.Dataset() actually accepts only mappings. That makes actually things a bit easier and much clearer.

@shoyer shoyer merged commit aebe60c into pydata:master Apr 17, 2019
@shoyer
Copy link
Member

shoyer commented Apr 17, 2019

Thanks @lumbric !

Feel free to work on improved error messages in another PR :)

dcherian added a commit to yohai/xarray that referenced this pull request Apr 19, 2019
* master: (29 commits)
  Handle the character array dim name  (pydata#2896)
  Partial fix for pydata#2841 to improve formatting. (pydata#2906)
  docs: Move quick overview one level up (pydata#2890)
  Manually specify chunks in open_zarr (pydata#2530)
  Minor improvement of docstring for Dataset (pydata#2904)
  Fix minor typos in docstrings (pydata#2903)
  Added docs example for `xarray.Dataset.get()` (pydata#2894)
  Bugfix for docs build instructions (pydata#2897)
  Return correct count for scalar datetime64 arrays (pydata#2892)
  Indexing with an empty array (pydata#2883)
  BUG: Fix pydata#2864 by adding the missing vrt parameters (pydata#2865)
  Reduce length of cftime resample tests (pydata#2879)
  WIP: type annotations (pydata#2877)
  decreased pytest verbosity (pydata#2881)
  Fix mypy typing error in cftime_offsets.py (pydata#2878)
  update links to https (pydata#2872)
  revert to 0.12.2 dev
  0.12.1 release
  Various fixes for explicit Dataset.indexes (pydata#2858)
  Fix minor typo in docstring (pydata#2860)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants