Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/1120 #1538

Merged
merged 8 commits into from
Sep 6, 2017
Merged

Fix/1120 #1538

merged 8 commits into from
Sep 6, 2017

Conversation

jhamman
Copy link
Member

@jhamman jhamman commented Aug 30, 2017

@jhamman
Copy link
Member Author

jhamman commented Aug 30, 2017

failures here appear to be related to dask distributed.

@@ -365,6 +365,21 @@ def merge_data_and_coords(data, coords, compat='broadcast_equals',
return merge_core(objs, compat, join, explicit_coords=explicit_coords)


def assert_valid_explicit_coords(variables, explicit_coords):
'''raise a MergeError if an explicit coord shares a name with a dimension
but is comprised of arbitrary dimensions'''
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you care: pep8 is """ & on their own lines

for name, var in variables.items():
if name not in explicit_coords:
var_dims.extend(var.dims)
for coord_name in explicit_coords:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're missing this as a function argument

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explicit_coords is a function argument.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was misreading something else, nevermind

var_dims.extend(var.dims)
for coord_name in explicit_coords:
if coord_name in var_dims and not all(
[d in var_dims for d in variables[coord_name].dims]):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than not all(...), I think this condition should be just variables[coord_name].dims != (coord_name,)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this works.

Raise a MergeError if an explicit coord shares a name with a dimension
but is comprised of arbitrary dimensions.
"""
var_dims = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a set here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow. Are you just suggesting we cast var_dims to a set after populating it or are you looking for some different logic for determining the dimensions in the non-explicit coord variables?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just that a set() is a better data structure for contains lookups than a list.

But actually, you should use the result of calculate_dimensions() here, instead of calculating dimensions twice.

@jhamman
Copy link
Member Author

jhamman commented Aug 31, 2017

@shoyer - thanks for the review. I iterated on this a few times and landed on something that was more complex than necessary. Your suggestions have been incorporated.

@jhamman jhamman mentioned this pull request Aug 31, 2017
13 tasks
@jhamman
Copy link
Member Author

jhamman commented Sep 5, 2017

This is ready for a final review. Tests are passing now.

@jhamman jhamman modified the milestone: 0.10 Sep 5, 2017
if coord_name in dims and variables[coord_name].dims != (coord_name,):
raise MergeError(
'coordinate %s shares a name with a dimension but '
'includes at least one arbitrary dimensions' % coord_name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"includes at least one arbitrary dimensions" is a little confusing to me.

Let's try to be totally explicit here, even if the error message needs to be longer. Maybe:

coordinate X shares a name with a dataset dimension, but is not a 1D variable along that dimension. This is disallowed by the xarray data model.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@jhamman jhamman merged commit 216bb67 into pydata:master Sep 6, 2017
@jhamman jhamman deleted the fix/1120 branch September 6, 2017 00:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants