Fix/1120 #1538
…oords during dataset construction
Failures here appear to be related to dask distributed.
xarray/core/merge.py
Outdated
@@ -365,6 +365,21 @@ def merge_data_and_coords(data, coords, compat='broadcast_equals',
    return merge_core(objs, compat, join, explicit_coords=explicit_coords)


def assert_valid_explicit_coords(variables, explicit_coords):
    '''raise a MergeError if an explicit coord shares a name with a dimension
    but is comprised of arbitrary dimensions'''
If you care: pep8 is """ for docstrings, with the opening and closing quotes on their own lines.
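For reference, a minimal sketch of the docstring layout this comment appears to describe: triple double quotes, with the opening and closing quotes each on their own line. The function body is a placeholder for illustration, not the PR's implementation.

```python
def assert_valid_explicit_coords(variables, explicit_coords):
    """
    Raise a MergeError if an explicit coord shares a name with a dimension
    but is comprised of arbitrary dimensions.
    """
    # Placeholder body: the actual check is discussed later in the review.
    return None
```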
xarray/core/merge.py
Outdated
    for name, var in variables.items():
        if name not in explicit_coords:
            var_dims.extend(var.dims)
    for coord_name in explicit_coords:
I think you're missing this as a function argument
explicit_coords is a function argument.
I was misreading something else, never mind.
xarray/core/merge.py
Outdated
            var_dims.extend(var.dims)
    for coord_name in explicit_coords:
        if coord_name in var_dims and not all(
                [d in var_dims for d in variables[coord_name].dims]):
Rather than not all(...), I think this condition should be just variables[coord_name].dims != (coord_name,)
yes, this works.
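To illustrate the difference between the two conditions, here is a small sketch. It assumes a hypothetical `Var` stand-in that carries only a `dims` tuple (it is not xarray's `Variable`), and an example in which a coordinate named `x` is two-dimensional. Under those assumptions, the `not all(...)` form lets the coordinate through, while the simpler comparison flags it.

```python
from collections import namedtuple

# Hypothetical stand-in for an xarray Variable: only the .dims tuple matters here.
Var = namedtuple("Var", ["dims"])

variables = {
    "a": Var(dims=("x", "y")),  # ordinary data variable
    "x": Var(dims=("x", "y")),  # explicit coord named after dimension "x", but 2D
}
explicit_coords = ["x"]

# Dimensions used by the non-coordinate variables, as in the diff under review.
var_dims = []
for name, var in variables.items():
    if name not in explicit_coords:
        var_dims.extend(var.dims)


def old_condition(coord_name):
    # Condition from the outdated diff.
    return coord_name in var_dims and not all(
        d in var_dims for d in variables[coord_name].dims)


def new_condition(coord_name):
    # Condition suggested in the review.
    return coord_name in var_dims and variables[coord_name].dims != (coord_name,)


old_flags = old_condition("x")  # False: every dim of "x" is already in var_dims
new_flags = new_condition("x")  # True: ("x", "y") is not the 1D tuple ("x",)
```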
xarray/core/merge.py
Outdated
    Raise a MergeError if an explicit coord shares a name with a dimension
    but is comprised of arbitrary dimensions.
    """
    var_dims = []
Use a set here.
I don't follow. Are you just suggesting we cast var_dims to a set after populating it, or are you looking for some different logic for determining the dimensions in the non-explicit coord variables?
Just that a set() is a better data structure for contains lookups than a list.
But actually, you should use the result of calculate_dimensions() here, instead of calculating dimensions twice.
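A minimal sketch of the first point, collecting dimension names into a set so that membership tests do not scan a list. `Var` is again a hypothetical stand-in holding only `dims`. The sketch does not call calculate_dimensions(); the review's preferred fix is to reuse its result rather than rebuild the collection at all.

```python
from collections import namedtuple

# Hypothetical stand-in for an xarray Variable: only the .dims tuple matters here.
Var = namedtuple("Var", ["dims"])

variables = {
    "a": Var(dims=("x", "y")),
    "b": Var(dims=("y", "z")),
}

# A set de-duplicates repeated dimension names and gives average O(1) lookups,
# whereas "name in some_list" scans the list element by element.
dims = set()
for var in variables.values():
    dims.update(var.dims)
```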
@shoyer - thanks for the review. I iterated on this a few times and landed on something that was more complex than necessary. Your suggestions have been incorporated.
This is ready for a final review. Tests are passing now.
xarray/core/merge.py
Outdated
        if coord_name in dims and variables[coord_name].dims != (coord_name,):
            raise MergeError(
                'coordinate %s shares a name with a dimension but '
                'includes at least one arbitrary dimensions' % coord_name)
"includes at least one arbitrary dimensions" is a little confusing to me.
Let's try to be totally explicit here, even if the error message needs to be longer. Maybe:
coordinate X shares a name with a dataset dimension, but is not a 1D variable along that dimension. This is disallowed by the xarray data model.
done.
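Putting the review suggestions together, the check might look roughly like the sketch below. This is an illustration under stated assumptions, not the merged code: `MergeError` and `Var` are self-contained stand-ins, and passing `dims` in as a third argument is an assumed signature chosen so the sketch does not depend on xarray internals.

```python
from collections import namedtuple


class MergeError(ValueError):
    """Stand-in for xarray's MergeError so the sketch is self-contained."""


# Hypothetical stand-in for an xarray Variable: only the .dims tuple matters here.
Var = namedtuple("Var", ["dims"])


def assert_valid_explicit_coords(variables, dims, explicit_coords):
    """
    Raise a MergeError if an explicit coord shares a name with a dimension
    but is not a 1D variable along that dimension.
    """
    for coord_name in explicit_coords:
        if coord_name in dims and variables[coord_name].dims != (coord_name,):
            raise MergeError(
                'coordinate %s shares a name with a dataset dimension, but is '
                'not a 1D variable along that dimension. This is disallowed '
                'by the xarray data model.' % coord_name)


# Usage: a coordinate named "x" that spans two dimensions is rejected.
bad_variables = {"x": Var(dims=("x", "y"))}
try:
    assert_valid_explicit_coords(bad_variables, {"x", "y"}, ["x"])
    message = None
except MergeError as err:
    message = str(err)
```

A 1D coordinate along its own dimension, such as `Var(dims=("x",))` under the name `x`, passes the same check without raising.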
Dataset with a coordinate given by a DataArray may create an invalid dataset #1120
git diff upstream/master | flake8 --diff
whats-new.rst for all changes and api.rst for new API