plot.line(): Draw multiple lines for 2D DataArrays. #1785

Merged: 7 commits, Dec 31, 2017

Conversation

dcherian
Contributor

@dcherian dcherian commented Dec 15, 2017

  • Tests added (for all bug fixes or enhancements)
  • Tests passed (for all non-documentation changes)
  • Passes git diff upstream/master **/*py | flake8 --diff (remove if you did not edit any Python files)
  • Fully documented, including whats-new.rst for all changes and api.rst for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)

Adds support for plotting multiple lines when plot.line() is given a 2D DataArray.

The x kwarg lets you specify which coordinate to use for the x-axis.

Example:

[Image: xarray-multiple-line]

@dcherian dcherian force-pushed the 2d-array-line-plot branch 3 times, most recently from a9b5d76 to 4572fd4, on December 15, 2017 19:54
@dcherian
Contributor Author

Now adds a legend too. It can be turned off with add_legend=False.

[Image: xarray-multiple-line]
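For readers without the screenshots, here is a minimal sketch of the usage described above. It assumes a recent xarray release that includes this PR, plus numpy and matplotlib; the data, the dimension names (time, station) and the variable name are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

# Hypothetical data: 50 time steps for 3 stations.
rng = np.random.default_rng(0)
da = xr.DataArray(
    rng.normal(size=(50, 3)).cumsum(axis=0),
    dims=("time", "station"),
    coords={"time": np.arange(50), "station": ["a", "b", "c"]},
    name="temperature",
)

fig, (ax_legend, ax_plain) = plt.subplots(1, 2, figsize=(8, 3))

# One line per station, with time along the x-axis and a legend.
lines = da.plot.line(x="time", ax=ax_legend)

# Same plot with the legend suppressed.
da.plot.line(x="time", ax=ax_plain, add_legend=False)
```

Passing x explicitly keeps the call unambiguous regardless of how the defaults are chosen.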

@fmaussion
Member

Looks quite good! Thanks.

Sorry to be annoying ;), but there should be tests for the features you document:

    x : string, optional
        Coordinate for x axis (2D inputs only). If None use darray.dims[1]
    add_legend : boolean, optional
        Add legend with y axis coordinates (2D inputs only).

Currently, choosing another dim for the x axis and the legend kwargs are not tested.

@dcherian
Contributor Author

@fmaussion I've added more tests. Let me know if you'd like more changes.

Member

@fmaussion fmaussion left a comment

Looks good to me, thanks!

@shoyer
Member

shoyer commented Dec 19, 2017

Is there a keyword argument name that it would make sense to use for the dimension that is repeated into multiple lines? Maybe hue, as is used by Seaborn?

I would like there to be a fully explicit way to make these plots.

@dcherian
Contributor Author

@fmaussion I've added one more commit. If x kwarg is not provided, we automatically choose the longer dimension. So if you have a 10000x3 DataArray, this will plot 3 lines instead of 10000.

Extended an existing test

@dcherian
Contributor Author

@shoyer hue sounds perfect. I'll work on adding that.

For arrays shaped as (100000, 2), this will plot 2 lines instead of 100000!
@shoyer
Member

shoyer commented Dec 19, 2017

> If x kwarg is not provided, we automatically choose the longer dimension. So if you have a 10000x3 DataArray, this will plot 3 lines instead of 10000.

I am very nervous about automated heuristics for choosing behavior. I would much rather we raise an error message in cases like this, rather than guessing. (The problem is that heuristics can make it very hard to predict/understand how code will work without trying it.)

@dcherian
Contributor Author

I've added support for a hue kwarg in the latest commit. Let me know what you think.

> I am very nervous about automated heuristics for choosing behavior. I would much rather we raise an error message in cases like this, rather than guessing. (The problem is that heuristics can make it very hard to predict/understand how code will work without trying it.)

Well, this behaviour is analogous to automatically choosing x,y in _plot2d. xarray is already pretty opinionated in this respect.

The current behaviour for a 10000x3 array is to plot 10000 lines which is bad. I'm OK with adding an error message but strongly feel that choosing to plot 3 lines (i.e. always smallest number of lines) is a good default.

Re:error, would that be a message stating that either x or hue be specified if the input is 2D?

@shoyer
Member

shoyer commented Dec 19, 2017

> Well, this behaviour is analogous to automatically choosing x,y in _plot2d. xarray is already pretty opinionated in this respect.

We use the order of the dimensions on the array for choosing how to plot it. The analogous behavior would be to always plot the longer dimension along the x-axis, which isn't what we do.

> The current behaviour for a 10000x3 array is to plot 10000 lines which is bad. I'm OK with adding an error message but strongly feel that choosing to plot 3 lines (i.e. always smallest number of lines) is a good default.

I agree that users probably rarely want 10,000 lines :). That's a good reason to require an explicit choice here.

The problem is edge cases like a 5x6 array. Do you want 5 lines of 6 points each or 6 lines of 5 points each? If we make the heuristic depend on the size of the array, then it will be very hard to understand what happens when the array shape changes slightly.
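To make the concern concrete, here is a small pure-Python sketch of a size-based heuristic. The function guess_x_dim and the dimension names are hypothetical and are not taken from the PR's code; the sketch only illustrates how the chosen x dimension flips when the shape changes slightly.

```python
def guess_x_dim(dims, shape):
    """Hypothetical heuristic: put the longer dimension on the x-axis."""
    # max() returns the first maximum, so ties fall back to dimension order.
    longest = max(range(len(shape)), key=lambda i: shape[i])
    return dims[longest]


dims = ("a", "b")
for shape in [(5, 6), (6, 5), (5, 5)]:
    x = guess_x_dim(dims, shape)
    # The other dimension determines how many lines are drawn.
    n_lines = shape[1 - dims.index(x)]
    print(shape, "-> x =", x, "| lines =", n_lines)
```

A (5, 6) array puts b on the x-axis while a (6, 5) array puts a there, and the (5, 5) tie is resolved only by an arbitrary rule. The result cannot be predicted from the call alone.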

> Re:error, would that be a message stating that either x or hue be specified if the input is 2D?

Yes, that sounds right.

Added an error if provided 'x' is not a dimension for 1D input (to
mirror _infer_xy_labels behaviour for 2D inputs)

Updated tests.
@dcherian
Contributor Author

@shoyer, @fmaussion: Updated to require either the x or hue kwarg if the input is 2D.
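A rough sketch of what "require either x or hue" could look like, written in plain Python. The helper infer_line_dims and its error messages are assumptions made for illustration, not the code merged in this PR.

```python
def infer_line_dims(dims, x=None, hue=None):
    """Return (x_dim, hue_dim) for a 2D input, refusing to guess."""
    if len(dims) != 2:
        raise ValueError("Expected two dimensions, got {!r}".format(dims))
    if x is None and hue is None:
        raise ValueError("For 2D inputs, please specify either hue or x.")
    for name in (x, hue):
        if name is not None and name not in dims:
            raise ValueError(
                "Input does not have specified dimension {!r}".format(name))
    # Whichever argument is missing takes the remaining dimension.
    if x is None:
        x = dims[1] if hue == dims[0] else dims[0]
    if hue is None:
        hue = dims[1] if x == dims[0] else dims[0]
    if x == hue:
        raise ValueError("x and hue must be different dimensions")
    return x, hue


print(infer_line_dims(("time", "station"), x="time"))
print(infer_line_dims(("time", "station"), hue="time"))
```

Because one of the two arguments is always given, the other follows from the dimension names and no size-based guess is needed.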

Member

@shoyer shoyer left a comment

One minor point -- otherwise looks good to me, thanks!

    if ndims == 1:
        xlabel, = darray.dims
        if x is not None and xlabel != x:
            raise ValueError('Input does not have specified dimension ' + x)
Member

Please use string formatting instead so this still works even if the dimension is not a string, e.g.,

raise ValueError('Input does not have specified dimension {!r}'.format(x))

(Using non-strings as dimension names is rare but we still try to support it when feasible.)
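A short standalone illustration of the difference. The two helper functions are hypothetical wrappers around the two message styles under discussion.

```python
def message_concat(x):
    # str + non-str raises TypeError, hiding the intended ValueError.
    return 'Input does not have specified dimension ' + x


def message_format(x):
    # {!r} inserts repr(x), so any object can be reported.
    return 'Input does not have specified dimension {!r}'.format(x)


print(message_format('time'))
print(message_format(0))

try:
    message_concat(0)
except TypeError as err:
    print('concatenation failed:', err)
```

With a dimension named 0, the concatenating version fails with a TypeError before the intended ValueError can be raised, while the formatted version reports the name as-is.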

@dcherian
Contributor Author

@shoyer Used your suggested change.

@@ -208,7 +208,8 @@ def line(darray, *args, **kwargs):
     if ndims == 1:
         xlabel, = darray.dims
         if x is not None and xlabel != x:
-            raise ValueError('Input does not have specified dimension ' + x)
+            raise ValueError('Input does not have specified dimension'
+                             + ' {!r}'.format(x))
Member

Just for future reference: you can skip the + character and rely on implicit string concatenation.

Contributor Author

Didn't know about that. Thanks.
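For reference, a small sketch of implicit concatenation of adjacent string literals. The variable names are illustrative only. One subtlety is worth knowing: with adjacent literals, .format() applies to the whole joined string, whereas with + it applies only to the literal it is attached to.

```python
x = 'time'

# Explicit concatenation, as in the diff above.
with_plus = ('Input does not have specified dimension'
             + ' {!r}'.format(x))

# Adjacent literals are joined by the parser; no + needed.
implicit = ('Input does not have specified dimension'
            ' {!r}'.format(x))

assert with_plus == implicit

# The scope of .format() differs between the two spellings:
plus_scope = 'a {}' + ' b {}'.format(1)       # formats only ' b {}'
implicit_scope = 'a {}' ' b {}'.format(1, 2)  # formats 'a {} b {}'
print(plus_scope)
print(implicit_scope)
```

In the PR's message the placeholder sits in the second literal, so both spellings produce the same text.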

@shoyer shoyer merged commit c368ee7 into pydata:master Dec 31, 2017
@shoyer
Member

shoyer commented Dec 31, 2017

Thanks!

@shoyer
Member

shoyer commented Dec 31, 2017

I merged this, but it might also be nice to add an example (or at least brief mention) to the narrative docs for 1D plotting. See:
http://xarray.pydata.org/en/stable/plotting.html#one-dimension
https://github.com/pydata/xarray/blob/master/doc/plotting.rst
