Rewrite KFP code generation #2993

Merged on Nov 11, 2022 (30 commits)

ptitzler (Member) commented Oct 28, 2022

This PR:

  • rewrites code generation for Kubeflow Pipelines v1 (v2 is still not supported)
  • restores a previously removed export option in the pipeline editor, enabling output of Python DSL in addition to the already supported YAML output (YAML remains the pre-selected option)
  • adds a new optional format parameter to the elyra-pipeline export CLI command:
    • if the parameter is not specified, the export format defaults to YAML for Kubeflow Pipelines and PY for Apache Airflow
    • if the parameter is specified, it must be one of YAML / PY (Kubeflow Pipelines) or PY (Airflow)
    • the value is case-insensitive, e.g. YAML, yAML, and yaml are equivalent
    $ elyra-pipeline export one-custom-node.pipeline --runtime-config cloning1 --format py
    
  • fixes an existing pipeline export bug (image pull secret information was not exported, rendering the exported pipeline unusable)
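
The format rules in the list above can be sketched as follows. This is a minimal illustration, not Elyra's implementation: the function name `resolve_export_format`, the runtime keys `"kfp"` and `"airflow"`, and the error message are assumptions made for this example.

```python
from typing import Optional

# Supported export formats per runtime; the first entry is the default.
# The runtime keys are hypothetical stand-ins for this sketch.
SUPPORTED_FORMATS = {
    "kfp": ["yaml", "py"],
    "airflow": ["py"],
}


def resolve_export_format(runtime_type: str, requested: Optional[str] = None) -> str:
    """Return the export format to use, applying defaults and validation."""
    supported = SUPPORTED_FORMATS[runtime_type]
    if requested is None:
        # No format specified: fall back to the runtime's default.
        return supported[0]
    # Compare case-insensitively, so "YAML", "yAML" and "yaml" are equivalent.
    normalized = requested.strip().lower()
    if normalized not in supported:
        raise ValueError(
            f"Unsupported format '{requested}' for runtime '{runtime_type}'; "
            f"expected one of: {', '.join(supported)}"
        )
    return normalized
```

With these assumptions, `resolve_export_format("kfp", "yAML")` yields `"yaml"`, while requesting `"yaml"` for Airflow raises a `ValueError`.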

Closes #2986

Follow-up for tests: #3002

What changes were proposed in this pull request?

The Kubeflow Pipelines processor now always generates Python DSL code as intermediate output when a pipeline is submitted or exported:

  • Submit: Internal Elyra pipeline representation -> Python DSL (generated by processor) -> YAML (generated by KFP argo/tekton compiler)
  • Export as Python DSL: Internal Elyra pipeline representation -> Python DSL (generated by processor)
  • Export as YAML: Internal Elyra pipeline representation -> Python DSL (generated by processor) -> YAML (generated by KFP argo/tekton compiler)
  • The pipeline documentation was updated to list the new output format.
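
The control flow described above (Python DSL is always produced, YAML only when needed) might look roughly like this. The function `produce_pipeline_output` and both stand-in callables are hypothetical; the real processor renders DSL from templates and hands it to the KFP compiler, neither of which is reproduced here.

```python
from typing import Callable


def produce_pipeline_output(
    pipeline: dict,
    want_yaml: bool,
    render_dsl: Callable[[dict], str],
    compile_dsl: Callable[[str], str],
) -> str:
    """Always render Python DSL first; compile it to YAML only on request."""
    dsl_source = render_dsl(pipeline)  # step shared by submit and export
    if not want_yaml:
        return dsl_source  # export as Python DSL
    return compile_dsl(dsl_source)  # submit, or export as YAML


# Stand-ins for illustration only; they do not produce real KFP output.
def fake_render(pipeline: dict) -> str:
    return f"# DSL for pipeline '{pipeline['name']}'"


def fake_compile(dsl_source: str) -> str:
    return f"compiled_from: {dsl_source!r}"
```

Passing the two stages in as callables keeps the sketch independent of any particular compiler (Argo or Tekton).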

How was this pull request tested?

  • Updated existing CLI tests
  • Added new server tests that validate that code generation yields the expected results. There are now dedicated tests for all code generation aspects:
    • generates expected code for the configured workflow engine
    • generates expected code for CRIO environments
    • generates expected code for a plain generic component
    • generates expected code for a plain generic component that utilizes a runtime image that is protected by a pull secret
    • generates expected code for a generic component configured with all supported Elyra-owned properties
    • generates expected code for a plain custom component
    • generates expected code for data exchange between generic components
  • Reviewed the output of make docs

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

@ptitzler ptitzler added the kind:enhancement New feature or request label Oct 28, 2022
@ptitzler ptitzler added this to the 3.13.0 milestone Oct 28, 2022
elyra-bot (bot) commented Oct 28, 2022

Thanks for making a pull request to Elyra!

To try out this branch on binder, follow this link: Binder

@ptitzler ptitzler marked this pull request as draft October 28, 2022 22:24
@ptitzler ptitzler added component:pipeline-runtime issues related to pipeline runtimes e.g. kubeflow pipelines platform: pipeline-Kubeflow Related to usage of Kubeflow Pipelines as pipeline runtime labels Oct 28, 2022
"""
# Load Kubeflow Pipelines Python DSL template
loader = PackageLoader("elyra", "templates/kubeflow")
template_env = Environment(loader=loader)

Check warning

Code scanning / CodeQL

Jinja2 templating with autoescape=False

Using jinja2 templates with autoescape=False can potentially allow XSS attacks.
elyra/pipeline/kfp/processor_kfp.py: Fixed
@ptitzler ptitzler added the status:Work in Progress Development in progress. A PR tagged with this label is not review ready unless stated otherwise. label Oct 28, 2022
Comment on lines +636 to +638
generic_component_template = Environment(
    loader=PackageLoader("elyra", "templates/kubeflow/v1")
).get_template("generic_component_definition_template.jinja2")

Check warning

Code scanning / CodeQL

Jinja2 templating with autoescape=False

Using jinja2 templates with autoescape=False can potentially allow XSS attacks.
@ptitzler ptitzler marked this pull request as ready for review November 2, 2022 19:13
@ptitzler ptitzler requested a review from akchinSTC November 7, 2022 23:27
@ptitzler ptitzler removed the status:Work in Progress Development in progress. A PR tagged with this label is not review ready unless stated otherwise. label Nov 8, 2022
akchinSTC (Member) left a comment

small nits, CRIO options look right and exported configs (volumes and modified bootstrapper options) look good.

docs/source/user_guide/pipelines.md: outdated, resolved
elyra/pipeline/kfp/processor_kfp.py: resolved
ptitzler and others added 2 commits November 9, 2022 07:00
ptitzler (Member, Author) commented Nov 9, 2022

Based on a discussion during today's dev meeting, I've updated code generation to "unload" the generated Python DSL module after it has been compiled. This should address the concern that module artifacts might accumulate over time and increase memory consumption.
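
One way to load a generated module and "unload" it again is sketched below using only the standard library. It is an assumption-laden illustration rather than the code in this PR: the helper `run_generated_module`, the module name `generated_pipeline_dsl`, and the entry point `generated_pipeline` are all hypothetical.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def run_generated_module(source: str, module_name: str = "generated_pipeline_dsl"):
    """Import generated source as a module, call its entry point, then unload it."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        module_path = Path(tmp_dir) / f"{module_name}.py"
        module_path.write_text(source)

        spec = importlib.util.spec_from_file_location(module_name, module_path)
        module = importlib.util.module_from_spec(spec)
        sys.modules[module_name] = module  # register while the module is in use
        try:
            spec.loader.exec_module(module)
            # Hypothetical entry point defined by the generated code.
            return module.generated_pipeline()
        finally:
            # Drop the registry entry so modules do not pile up across runs.
            sys.modules.pop(module_name, None)
```

Removing the entry from `sys.modules` only releases the import system's reference; the module object is freed once no other references to it remain.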

kiersten-stokes (Member) left a comment

Wow this is awesome! Tested a ton of scenarios and all are turning out as expected 🎉

elyra/templates/kubeflow/v1/python_dsl_template.jinja2: outdated, resolved
elyra/pipeline/kfp/processor_kfp.py: outdated, resolved
ptitzler (Member, Author) commented

To address offline review feedback, I've updated the Python DSL template to render a comment that identifies the node name:

    # Task for node 'Download File'
    task_8ee5ee17_1222_434e_bcf4_fd12f43e2510 = factory_8e4384f422a088e4814024df7955e952c1488bd091fa0d4873d5f611d741ceb4(
        url="https://raw.gith...",
    )

This is similar to what is done for Apache Airflow.
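
A comment header like the one in the snippet above could be produced along these lines. The helper `render_task_header` is hypothetical, and deriving the variable name by replacing hyphens in the node id with underscores is an assumption based only on how the example output looks.

```python
from typing import Tuple


def render_task_header(node_id: str, node_name: str) -> Tuple[str, str]:
    """Return the comment line and the task variable name for one node."""
    # Collapse whitespace (including newlines) so the label stays on one comment line.
    safe_name = " ".join(node_name.split())
    comment = f"# Task for node '{safe_name}'"
    # Assumption: the variable name is "task_" plus the node id with "-" replaced by "_".
    task_variable = "task_" + node_id.replace("-", "_")
    return comment, task_variable
```

Collapsing whitespace matters because a label containing a line break would otherwise spill out of the comment and into the generated code.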

Signed-off-by: Patrick Titzler <ptitzler@us.ibm.com>
@akchinSTC akchinSTC merged commit e682ef4 into elyra-ai:main Nov 11, 2022
@ptitzler ptitzler mentioned this pull request Nov 30, 2022
Labels
component:pipeline-runtime issues related to pipeline runtimes e.g. kubeflow pipelines kind:enhancement New feature or request platform: pipeline-Kubeflow Related to usage of Kubeflow Pipelines as pipeline runtime
Development

Successfully merging this pull request may close these issues.

Refactor code generation for Kubeflow Pipelines
3 participants