WithItems Support #1868

Merged
merged 19 commits into from
Aug 24, 2019

Contributor

@kevinbache kevinbache commented Aug 16, 2019

This PR adds for loop support to the KFP DSL.

Users instantiate loops like so:

from kfp import dsl

@dsl.pipeline(name='my-pipeline', description='A pipeline with a loop.')
def pipeline(my_pipe_param=10):
    loop_args = [{'a': 1, 'b': 2}, {'a': 10, 'b': 20}]
    with dsl.ParallelFor(loop_args) as item:
        op1 = dsl.ContainerOp(
            name="my-in-cop",
            image="library/bash:4.4.23",
            command=["sh", "-c"],
            arguments=["echo op1 %s %s" % (item.a, my_pipe_param)],
        )

Loops currently support multiple operations within the loop body as well as nested operations. They do not yet support using the output of another operation as the input to the loop.
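
As a rough illustration of the loop semantics described above, the sketch below shows one way a loop variable such as `item` could stand in for an Argo `withItems` placeholder while the pipeline function runs. It is not the code from this PR: `LoopItemSketch` and `build_task_sketch` are hypothetical names, the returned dict is a simplified stand-in rather than a valid Argo task, and the only external fact assumed is Argo's `{{item.<key>}}` placeholder syntax.

```python
class LoopItemSketch:
    """Hypothetical stand-in for the variable yielded by the `with` block."""

    def __init__(self, items):
        self.items = items

    def __getattr__(self, key):
        # Only reached for attributes that are not set normally (item.a, item.b).
        if key.startswith("_"):
            raise AttributeError(key)
        # Emit a placeholder instead of a concrete value; the workflow engine
        # would substitute the real value once per loop iteration.
        return "{{item.%s}}" % key


def build_task_sketch(name, items, make_arguments):
    """Build a simplified task dict (hypothetical shape, not real compiler output)."""
    item = LoopItemSketch(items)
    return {"name": name, "arguments": make_arguments(item), "withItems": items}


loop_args = [{'a': 1, 'b': 2}, {'a': 10, 'b': 20}]
task = build_task_sketch(
    "my-in-cop",
    loop_args,
    lambda item: ["echo op1 %s %s" % (item.a, 10)],
)
print(task["arguments"])  # ['echo op1 {{item.a}} 10']
```

The point of the placeholder approach is that the loop body is described once and the list of items travels alongside it, so the number of iterations does not affect the size of the task definition.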


@kevinbache kevinbache changed the title from "W" to "WithItems Support" Aug 16, 2019
@kevinbache
Contributor Author

/retest

@@ -677,6 +800,9 @@ def compile(self, pipeline_func, package_path, type_check=True):
yaml.Dumper.ignore_aliases = lambda *args : True
yaml_text = yaml.dump(workflow, default_flow_style=False)

if package_path is None:
Contributor

What are the use cases where the YAML text is needed?
The Compiler()._compile method returns the workflow dict which seems more useful.

Contributor Author

I was using it for visual debugging while I was developing. I figured, why not leave the option for anyone else who ends up working on the compiler? What do you guys think?
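
To make the trade-off in this thread concrete, here is a minimal sketch of the "return the text when no path is given" pattern, assuming that is what the `package_path is None` branch does. `emit_workflow_sketch` is a hypothetical helper, and `json.dumps` stands in for `yaml.dump` so that the example needs only the standard library.

```python
import json
import os
import tempfile


def emit_workflow_sketch(workflow, package_path=None):
    """Hypothetical helper: return serialized text, or write it if a path is given."""
    # JSON is used here purely as a stand-in for the YAML dump in the diff.
    text = json.dumps(workflow, indent=2, sort_keys=True)
    if package_path is None:
        return text  # handy for eyeballing the output while debugging
    with open(package_path, "w") as f:
        f.write(text)
    return None


workflow = {"kind": "Workflow", "spec": {"entrypoint": "my-pipeline"}}

# No path: the caller gets the text back and can print or diff it.
print(emit_workflow_sketch(workflow))

# With a path: the text is written to disk instead.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pipeline.json")
    emit_workflow_sketch(workflow, path)
    with open(path) as f:
        round_tripped = json.load(f)
```

Callers that want structured access would still be better served by the workflow dict, as the reviewer notes; the text form mainly helps a human reading the output.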

@@ -591,6 +591,12 @@ def some_pipeline():
if container:
self.assertEqual(template['retryStrategy']['limit'], 5)

def test_withitem_basic(self):
Contributor

Is it possible to test the actual behavior of the feature instead of just comparing the YAML text?

It would be great if the tests did not start failing when some unrelated part (e.g. the pipeline name or metadata) changes.

Contributor Author

I agree, it'd be nice to have an e2e test as well. Either way, we'll want to include some unit tests too, and this is the pattern the repo uses.

Contributor

Unit tests are important and we should have them. However, we can usually do better than just comparing the YAML output of the whole pipeline (although checking the loop compilation behavior might be non-trivial).
As an example of unit tests that test the feature behavior, see:
test_init_container
test_op_transformers
test_set_display_name

Contributor Author

I agree, less brittle tests would be nice, though the full YAML comparison is more thorough and that is most of how we currently test the compiler.
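
One way to picture the less brittle style discussed in this thread is a test that asserts only on the loop-related fields of the compiled workflow dict. The sketch below is an illustration under assumptions, not a test from the repo: `find_loop_tasks` is a hypothetical helper, the `workflow` dict is a hand-written stand-in for compiler output (the `for-loop-1` name is invented), and the only structure relied on is Argo's `spec.templates[].dag.tasks[].withItems` layout.

```python
def find_loop_tasks(workflow):
    """Return every DAG task in the workflow dict that carries a withItems list."""
    found = []
    for template in workflow.get("spec", {}).get("templates", []):
        for task in template.get("dag", {}).get("tasks", []):
            if "withItems" in task:
                found.append(task)
    return found


# Hand-written stand-in for compiler output; real output has many more fields.
workflow = {
    "metadata": {"generateName": "my-pipeline-"},
    "spec": {
        "templates": [
            {"name": "my-pipeline", "dag": {"tasks": [
                {"name": "for-loop-1", "template": "for-loop-1",
                 "withItems": [{"a": 1, "b": 2}, {"a": 10, "b": 20}]},
            ]}},
            {"name": "my-in-cop", "container": {"image": "library/bash:4.4.23"}},
        ],
    },
}

loop_tasks = find_loop_tasks(workflow)
assert len(loop_tasks) == 1
assert loop_tasks[0]["withItems"] == [{"a": 1, "b": 2}, {"a": 10, "b": 20}]
```

Because the assertions ignore `metadata` and unrelated templates, renaming the pipeline would not break them. The cost, as noted above, is that such a test is less thorough than a full YAML comparison.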



# @dsl.pipeline(name='my-pipeline')
# def pipeline(my_pipe_param=10):
Contributor

Is this test failing?

Contributor Author

shouldn't be

Contributor

Should there be commented out code here?

return op_name_to_op

def _fill_loop_args(self, new_root):
"""Traverses through graph, plucking up loop_args vars from ops groups and depositing pointers to them on the
Contributor

Can you elaborate on this a bit more?

Contributor Author

You're right, this is a bit vague.

@Ark-kun
Contributor

Ark-kun commented Aug 16, 2019

Thank you for this great work!
Is this PR finished or still WIP?

@kevinbache
Contributor Author

/retest

@k8s-ci-robot k8s-ci-robot removed the lgtm label Aug 20, 2019
@kevinbache
Contributor Author

/ping

@hongye-sun hongye-sun self-assigned this Aug 21, 2019
Contributor

@hongye-sun hongye-sun left a comment

Could you add a basic example for this feature? It doesn't have to be in the same PR.

)


# @dsl.pipeline(name='my-pipeline')
Contributor

remove?

@hongye-sun
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot removed the lgtm label Aug 23, 2019
@kevinbache
Contributor Author

kevinbache commented Aug 23, 2019

/assign @IronPan

@Ark-kun
Contributor

Ark-kun commented Aug 23, 2019

/lgtm
/approve

@Ark-kun
Contributor

Ark-kun commented Aug 23, 2019

git checkout origin/master .gitignore

@kevinbache
Contributor Author

/assign @neuromage

@k8s-ci-robot k8s-ci-robot removed the lgtm label Aug 24, 2019
@neuromage
Contributor

/lgtm
/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Ark-kun, neuromage

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 96fd193 into kubeflow:master Aug 24, 2019
@kevinbache
Contributor Author

closes #1481

@gaoning777
Contributor

This is great. BTW, have you tested the withItems support together with recursion, for example by creating a ParallelFor loop inside a recursive function? AFAIK, this is a common case: the outer recursion controls when to stop an HP run, and the inner ParallelFor runs some parameters in parallel.

magdalenakuhn17 pushed a commit to magdalenakuhn17/pipelines that referenced this pull request Oct 22, 2023
7 participants