
[patch] Relocate pyiron_workflow.job #1126

Draft: wants to merge 35 commits into base `main`
Conversation

liamhuber (Member)

Move the module here using git copy and update dependencies. Note that this is waiting on a release of pyiron_workflow that doesn't exist yet -- depending on existing releases would make a cyclic dependency.

liamhuber and others added 30 commits January 31, 2024 17:07
Saving now requires that the saved nodes be importable, which would be the case if you copied and pasted the example code into a notebook, but is not the case in the dynamic context where doctest runs things. So don't define a custom "Sleep" node here; use a standard node instead.
It was just not showing up because the test name was doubled up
And test for it, hopefully, we'll find out on the CI I guess
Make NodeJob compliant with the storage interface
To clear room for a simpler NodeJob that _doesn't_ lean on pyiron_workflow's storage implementation
Rename NodeJob to StoredNodeJob
It's just a less-good version of the `NodeOutputJob`
I suspect what's happening is `DataContainer` is leaning on `h5io_browser`'s `use_state` defaults, but I won't dig into it now.
[breaking] Replace the wrapper job with a more robust job subclass
In the process of getting nicer phrasing
* Change paradigm to whether the node uses `__reduce__` and a constructor

Instead of "Meta" nodes

* Allow direct use of Constructed children

* Move and update constructed stuff

* Add new singleton behaviour so factory-produced classes can pass is-tests

* PEP8 newline

* Remove unnecessary __getstate__

The object isn't holding instance-level state, and older versions of Python bork here.

* Add constructed __*state__ compatibility for older versions

* 🐛 add missing `return`

* Format black

* Revert singleton

* Remove constructed

It's superseded by the snippets.factory stuff

* Format black

* Let the factory clear method take specific names

* Don't override __module__ to the factory function

If it was explicitly set downstream, leave that. But if the user left it empty, still default it back to the factory function's module

* Clean up storage if job tests fail

* Make tinybase the default storage backend

* Switch Function and Macro over to using classfactory

With this, everything is pickleable (unless you slap something unpickleable on top, or define it in a place that can't be reached by pickle like inside a local function scope). The big downside is that `h5io` storage is now basically useless, since all our nodes come from custom reconstructors. Similarly, for the node job `DataContainer` can no longer store the input node. The `tinybase` backend is still working ok, so I made it the default, and I got the node job working again by forcing it to cloudpickle the input node on saving. These are some ugly hacks, but since storage is an alpha feature right now anyhow, I'd prefer to push ahead with pickleability.

* Remove unused decorator

And reformat tests in the vein of usage in Function and Macro

* Format black

---------

Co-authored-by: pyiron-runner <pyiron@mpie.de>
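The "force it to cloudpickle the input node on saving" hack described above can be sketched roughly as follows. This is a minimal illustration only; `serialize_node` and `deserialize_node` are hypothetical helper names, not the actual pyiron_contrib API, and the only assumption is that the third-party `cloudpickle` package is available for the fallback path.

```python
import pickle


def serialize_node(node) -> bytes:
    """Pickle a node, falling back to cloudpickle for dynamic classes.

    Hypothetical helper illustrating the hack described above; the real
    job code differs in detail.
    """
    try:
        # Works when the node's class is importable (defined in a module)
        return pickle.dumps(node)
    except Exception:
        # Factory-produced or notebook-defined classes need cloudpickle
        import cloudpickle  # third-party dependency

        return cloudpickle.dumps(node)


def deserialize_node(blob: bytes):
    # cloudpickle output is loadable with the plain pickle module
    return pickle.loads(blob)
```

Either way the bytes round-trip through plain `pickle.loads`, so the storage backend doesn't need to know which serializer produced them.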
* Expose concurrent.futures executors on the creator

* Only expose the base Executor from pympipool

Doesn't hurt us now and prepares for the version bump

* Extend `Runnable` to use a non-static method

This is significant. `on_run` is no longer a property returning a staticmethod that will be shipped off, but we directly ship off `self.on_run` so `self` goes with it to remote processes. Similarly, `run_args` gets extended to be `tuple[tuple, dict]` so positional arguments can be sent too.

Stacked on top of pickleability, this means we can now use standard `concurrent.futures.ProcessPoolExecutor` -- as long as the nodes are all defined somewhere importable, i.e. not in `__main__`. Since working in notebooks is pretty common, the more flexible `pympipool.Executor` is left as the default `Workflow.create.Executor`.

This simplifies some stuff under the hood too, e.g. `Function` and `Composite` now just directly do their thing in `on_run` instead of needing the misdirection of returning their own static methods.

* Format black
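The `on_run`/`run_args` change can be illustrated with a minimal sketch. Class and method names follow the description above but are heavily simplified; the real `Runnable` in pyiron_workflow carries much more machinery.

```python
from concurrent.futures import Executor, Future, ThreadPoolExecutor


class Runnable:
    """Sketch: ship the bound method self.on_run, so self travels with it."""

    def on_run(self, *args, **kwargs):
        raise NotImplementedError

    @property
    def run_args(self) -> tuple[tuple, dict]:
        # Positional and keyword arguments to pass to on_run
        return (), {}

    def run_with(self, executor: Executor) -> Future:
        args, kwargs = self.run_args
        # self.on_run is a bound method: pickling it serializes self too,
        # so the instance's state is available in the remote process
        return executor.submit(self.on_run, *args, **kwargs)


class Doubler(Runnable):
    def __init__(self, x):
        self.x = x

    def on_run(self):
        return 2 * self.x


with ThreadPoolExecutor() as pool:
    result = Doubler(21).run_with(pool).result()
```

With a `ProcessPoolExecutor` the same code works as long as `Doubler` is defined in an importable module rather than `__main__`, which is exactly the constraint noted above.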


* Compute qualname if not provided

* Fail early if there is a <locals> function in the factory made hierarchy

* Skip the factory fanciness if you see <locals>

This enables _FactoryMade objects to be cloudpickled, even when they can't be pickled, while still not letting the mere fact that they are dynamic classes stand in the way of pickling.

Nicely lifts our constraint on the node job interaction with pyiron base, which was leveraging cloudpickle

* Format black

* Test ClassFactory this way too

---------

Co-authored-by: pyiron-runner <pyiron@mpie.de>
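The `<locals>` check described above could look roughly like this; a hedged sketch with an illustrative function name, since the actual implementation lives in the factory code and may differ.

```python
def defined_in_local_scope(cls: type) -> bool:
    """True if the class, or any of its bases, was defined inside a
    function body: such classes carry "<locals>" in their qualname and
    cannot be found by plain pickle's import-based class lookup."""
    return any("<locals>" in c.__qualname__ for c in cls.__mro__)


def outer():
    class Inner:  # __qualname__ == "outer.<locals>.Inner"
        pass

    return Inner


class TopLevel:
    pass
```

When this check fires, the factory machinery can skip its pickle-oriented fanciness and leave the class to cloudpickle, which serializes the class definition by value instead of by import reference.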
Except pyiron_base, which is already in the main dependencies. This includes a reference to a version of pyiron_workflow that doesn't exist yet -- we can't rely on an existing version because it would give a cyclic dependency error.
github-actions bot (Contributor) commented Aug 8, 2024

Binder 👈 Launch a binder notebook on branch pyiron/pyiron_contrib/git-copy-target

liamhuber (Member, Author)

pyiron_workflow and pyiron_base conflict over pyiron_snippets:

  Could not solve for environment specs
  The following packages are incompatible
  ├─ pyiron_base 0.9.11**  is installable and it requires
  │  └─ pyiron_snippets 0.1.3 , which can be installed;
  └─ pyiron_workflow 0.10.0**  is not installable because there are no viable options
     ├─ pyiron_workflow 0.10.0 would require
     │  └─ pyiron_snippets 0.1.4 , which conflicts with any installable versions previously reported;
     └─ pyiron_workflow 0.10.0 would require
        └─ pyiron_base 0.9.12 , which conflicts with any installable versions previously reported.

I guess I'll go try bumping base
