Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure HighLevelGraph layers are Layer instances #5271

Merged
merged 2 commits into from
May 6, 2021

Conversation

jrbourbeau
Copy link
Contributor

This PR updates how we construct the HighLevelGraph in subset_dataset_to_block. Currently we create a HighLevelGraph and then manually add a few new layers, which are dicts, directly to hlg.layers.

hlg = HighLevelGraph.from_collections(
gname,
graph,
dependencies=[arg for arg in npargs if dask.is_dask_collection(arg)],
)
for gname_l, layer in new_layers.items():
# This adds in the getitems for each variable in the dataset.
hlg.dependencies[gname_l] = {gname}
hlg.layers[gname_l] = layer

However since (in more recent versions of Dask) HighLevelGraph layers are expected to be Layer class instances this can result in unexpected errors (xref #5077 (comment)). Instead of manually adding new layers to hlg.layers, this PR proposes we create a new HighLevelGraph altogether to ensure that hlg.layers won't contain any raw dict layers.

Copy link
Member

@andersy005 andersy005 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing this issue, @jrbourbeau!

It appears that the failing tests are unrelated to your changes

@dcherian
Copy link
Contributor

dcherian commented May 6, 2021

Great thanks @jrbourbeau

@dcherian dcherian merged commit 020345d into pydata:master May 6, 2021
@jrbourbeau jrbourbeau deleted the hlg-fixup branch May 6, 2021 20:37
@dcherian dcherian mentioned this pull request May 13, 2021
9 tasks
dcherian added a commit to TomNicholas/xarray that referenced this pull request May 13, 2021
* upstream/master:
  combine keep_attrs and combine_attrs in apply_ufunc (pydata#5041)
  Explained what a deprecation cycle is (pydata#5289)
  Code cleanup (pydata#5234)
  FacetGrid docstrings (pydata#5293)
  Add whats new for dataset interpolation with non-numerics (pydata#5297)
  Allow dataset interpolation with different datatypes (pydata#5008)
  Flexible indexes: add Index base class and xindexes properties (pydata#5102)
  pre-commit: autoupdate hook versions (pydata#5280)
  convert the examples for apply_ufunc to doctest (pydata#5279)
  fix the new whatsnew section
  Ensure `HighLevelGraph` layers are `Layer` instances (pydata#5271)
dcherian added a commit to matzegoebel/xarray that referenced this pull request May 13, 2021
* upstream/master: (23 commits)
  combine keep_attrs and combine_attrs in apply_ufunc (pydata#5041)
  Explained what a deprecation cycle is (pydata#5289)
  Code cleanup (pydata#5234)
  FacetGrid docstrings (pydata#5293)
  Add whats new for dataset interpolation with non-numerics (pydata#5297)
  Allow dataset interpolation with different datatypes (pydata#5008)
  Flexible indexes: add Index base class and xindexes properties (pydata#5102)
  pre-commit: autoupdate hook versions (pydata#5280)
  convert the examples for apply_ufunc to doctest (pydata#5279)
  fix the new whatsnew section
  Ensure `HighLevelGraph` layers are `Layer` instances (pydata#5271)
  New whatsnew section
  Release-workflow: Bug fix (pydata#5273)
  more maintenance on whats-new.rst (pydata#5272)
  v0.18.0 release highlights (pydata#5266)
  Fix exception when display_expand_data=False for file-backed array. (pydata#5235)
  Warn ignored keep attrs (pydata#5265)
  Disable workflows on forks (pydata#5267)
  fix the built wheel test (pydata#5270)
  pypi upload workflow maintenance (pydata#5269)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants