A different approach to Dynamic Pipelines #2663

Closed
desmond-dsouza opened this issue Jun 8, 2023 · 4 comments

Labels
Issue: Feature Request (New feature or improvement to existing feature)

Comments

@desmond-dsouza

Description

I will quote from this blog post on Haskell's Shake build system:

The most important thing Shake got right was adding monadic/dynamic dependencies. Most build systems start with a static graph, and then, realizing that can't express the real world, start hacking in an unprincipled manner. The resulting system becomes a bunch of special cases. Shake embraced dynamic dependencies. That makes some things harder (no static cycle detection, less obvious parallelism, must store dependency edges), but all those make Shake itself harder to write, while dynamic dependencies make Shake easier to use. I hope that eventually all build systems gain dynamic dependencies.

Context

This relates to the current discussions on dynamic pipelines in Kedro. A more principled approach may end up being more flexible, understandable, and usable.

Shake has been used, among other things, for a bioinformatics pipeline management tool called BioShake. That domain has a lot in common with Kedro's domain.

Possible Implementation

Emulate Shake in Python, which has more than enough dynamism. This would need to accommodate today's static configuration styles, which should be doable.

Possible Alternatives

@desmond-dsouza added the "Issue: Feature Request" label on Jun 8, 2023
@noklam
Contributor

noklam commented Jun 8, 2023

Thanks for creating this discussion. This is very interesting, though I am not familiar with Haskell or Shake.

Could you describe how it would work for Kedro to embrace it, and how Shake solves this problem differently?

@desmond-dsouza
Author

desmond-dsouza commented Jun 9, 2023

I'm new to Kedro and just happened to hit this limitation early. This is a short reply; I will try to expand on it later.

The key is that Shake uses a DSL (embedded in Haskell) to write the dependencies in code. In particular, some dependencies may be static (of course), but others can be determined after examining interim results of previous dependencies.
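
To make that concrete, here is a minimal Python sketch of the idea, not Shake's or Kedro's actual API. Every name in it (`Build`, `need`, `rules`, `manifest`, `report`) is hypothetical. Each rule receives a `need` callback, and a rule may call `need` on targets it only learns about after inspecting an earlier result. The dependency edges are recorded as they are discovered, and cycles can only be caught at run time, matching the trade-offs mentioned in the quote.

```python
from typing import Callable, Dict, List

# Hypothetical names throughout: this is not the Kedro or Shake API.
Need = Callable[[str], object]
Rule = Callable[[Need], object]


class Build:
    """Tiny memoising builder where rules discover dependencies at run time."""

    def __init__(self, rules: Dict[str, Rule]) -> None:
        self.rules = rules
        self.results: Dict[str, object] = {}
        self.edges: Dict[str, List[str]] = {}  # dependency edges, recorded as found
        self._stack: List[str] = []            # rules currently being evaluated

    def need(self, target: str) -> object:
        # Record the edge from whichever rule is currently running.
        if self._stack:
            self.edges.setdefault(self._stack[-1], []).append(target)
        # No static graph exists, so cycles can only be detected here.
        if target in self._stack:
            raise RuntimeError(f"cycle detected at run time: {target}")
        if target not in self.results:
            self._stack.append(target)
            try:
                self.results[target] = self.rules[target](self.need)
            finally:
                self._stack.pop()
        return self.results[target]


def manifest(need: Need) -> List[str]:
    return ["part_a", "part_b"]  # a leaf rule with no dependencies


def report(need: Need) -> str:
    parts = need("manifest")           # static dependency, known up front
    chunks = [need(p) for p in parts]  # dynamic: chosen from manifest's result
    return "+".join(chunks)


rules: Dict[str, Rule] = {
    "manifest": manifest,
    "part_a": lambda need: "A",
    "part_b": lambda need: "B",
    "report": report,
}

build = Build(rules)
output = build.need("report")
```

After the run, `build.edges` holds the edges that were actually followed, which is the information a static graph would otherwise have declared in advance.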

With apologies for posting only links for now, the paper, which includes skeletal code, is here:
https://ndmitchell.com/downloads/paper-shake_before_building-10_sep_2012.pdf

The code is fairly readable, as most of it maps to Python @dataclasses and functions (with generics in signatures, and no parentheses needed to call a function). The $ is just function application, and do notation is syntactic sugar for implicitly threading the result of one step into the next step. [EDIT: the do is needed because Haskell is pure; it can mostly be ignored for Python]

Short talk (15 min) here: https://www.youtube.com/watch?v=xYCPpXVlqFM
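
As a rough reading aid, the snippet below shows how those two pieces of notation might translate into Python. The Haskell fragments in the comments are illustrative only and are not taken from the paper, and the helper names `read_lines` and `tar_command` are made up for this example.

```python
from typing import List


def read_lines(text: str) -> List[str]:
    """Split text into non-empty lines."""
    return [line for line in text.splitlines() if line]


def tar_command(archive: str, members: List[str]) -> List[str]:
    """Build an argument list; nothing is executed."""
    return ["tar", "-cf", archive] + members


# Illustrative Haskell:  tarCommand "out.tar" $ readLines listing
# `$` applies the function on its left to everything on its right,
# so in Python it is just a nested call with explicit parentheses.
listing = "a.txt\nb.txt\n"
nested = tar_command("out.tar", read_lines(listing))

# Illustrative Haskell do block:
#   do members <- readLines listing
#      tarCommand "out.tar" members
# Each `<-` binds the result of one step for use by the next,
# which in Python is ordinary sequential assignment.
members = read_lines(listing)
sequential = tar_command("out.tar", members)
```

Both forms produce the same argument list, which is the sense in which the do notation can mostly be ignored when reading the code with Python in mind.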

@stichbury
Contributor

Just adding #2627 pointer here so the two issues are linked

@astrojuanlu
Member

Example of said DSL:

[Image: example of the Shake DSL]

So in essence, the idea would be to declare (in our case in Python) that the dependencies of one node would be dynamically generated from another node.
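
A minimal sketch of what such a declaration could look like, in plain Python and deliberately not using any Kedro API. The names `DynamicNode`, `inputs_from`, and `run_dynamic` are hypothetical, and the catalog is modelled as an ordinary dict. The node names another dataset whose value lists its inputs, and those inputs are resolved only when the node runs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DynamicNode:
    """Hypothetical node whose inputs are listed by another dataset's value."""

    func: Callable[..., object]
    output: str
    inputs_from: str  # dataset holding the names of this node's inputs


def run_dynamic(node: DynamicNode, catalog: Dict[str, object]) -> List[str]:
    """Resolve the node's inputs at run time, run it, and store its output."""
    input_names = list(catalog[node.inputs_from])  # known only after upstream ran
    args = [catalog[name] for name in input_names]
    catalog[node.output] = node.func(*args)
    return input_names  # the dependency edges that were actually used


# Stand-in for a data catalog; "regions" would be produced by an upstream node.
catalog: Dict[str, object] = {
    "regions": ["sales_eu", "sales_us"],
    "sales_eu": 10,
    "sales_us": 32,
}

total_node = DynamicNode(
    func=lambda *values: sum(values),
    output="total",
    inputs_from="regions",
)

resolved = run_dynamic(total_node, catalog)
```

Here `resolved` is `["sales_eu", "sales_us"]` and `catalog["total"]` is `42`. A runner built this way would have to schedule `total_node` after whatever produces `regions`, since the full input list is not available before then.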

This is a potential solution to a subset of the problems expressed in #2627. I'd say let's continue the discussion there.

@astrojuanlu closed this as not planned on Nov 25, 2023

4 participants