Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pipeline Editor: Extend data volume node property #2961

Merged
merged 23 commits into from
Oct 13, 2022

Conversation

ptitzler
Copy link
Member

@ptitzler ptitzler commented Oct 8, 2022

Requires #2957 RESOLVED

This PR enables users to optionally customize the read_only and sub_path attributes of the data volume property.

image

The new attributes are supported for the Kubeflow Pipelines and Apache Airflow runtimes and apply to generic and custom components.

image

Benefits

The main benefit of this enhancement is more fine granular access control for mounted volumes, as illustrated in the following scenarios:

  • Scenario 1: a Kubernetes PVC is configured for read access (ROX)
  • Scenario 2: a Kubernetes PVC is configured for read/write access (RWO, RWX)

In both scenarios the underlying filesystem contains the same directories, each including several files.

/
/inputs/...
/outputs/...
/other/...

Pre-PR behavior

  • Scenario 1: pipeline nodes have read access to the entire filesystem (/, /inputs/..., etc)
  • Scenario 2: pipeline nodes have read/write access to the entire filesystem (/, /inputs/..., etc)

This implies that the entire filesystem is unprotected, because every single file can be accessed, and in the case of scenario 2 manipulated.

post-PR behavior

  • Scenario 1: Read access to the entire filesystem (/, /inputs/..., etc) or parts of the filesystem (only /inputs/...) can be granted, as appropriate
  • Scenario 2: Read/write access to the entire filesystem (/, /inputs/..., etc) or parts of the filesystem (read access for /inputs/..., write access to /outputs/...) can be granted, as appropriate

Note that if the PVC's access mode is read-only, setting the read_only attribute to False does not grant write access.

Closes #2953

What changes were proposed in this pull request?

See above

How was this pull request tested?

  • Reviewed the output of make docs
  • Updated server tests
  • Manual testing of the scenarios listed above

Correct behavior can be confirmed by inspecting the node's pod (e.g. kubectl describe pod -n kubeflow-user-example-com <pod-id>)

    ...
    Mounts:
      /mnt/vol1 from test-pvc (rw,path="sub-dir-1")
      /var/run/argo from var-run-argo (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-editor-token-jf5kd (ro)
    ...
Volumes:
  test-pvc:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  test-pvc
    ReadOnly:   false

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

@ptitzler ptitzler added kind:enhancement New feature or request component:pipeline-editor pipeline editor labels Oct 8, 2022
@ptitzler ptitzler added this to the 3.13.0 milestone Oct 8, 2022
@elyra-bot
Copy link

elyra-bot bot commented Oct 8, 2022

Thanks for making a pull request to Elyra!

To try out this branch on binder, follow this link: Binder

@ptitzler
Copy link
Member Author

@marthacryan @ajbozarth can we render boolean default node values as checkboxes, like we do for node properties?

image

@ptitzler
Copy link
Member Author

ptitzler commented Oct 10, 2022

I just realized that using a checkbox as input widget for boolean values causes a usability issue.

Illustrating this using this PR scenario as an example. Assume the pipeline default value for a boolean property attribute (read-only) is set to True. (The internal default is False.) When this property is rendered in the node properties view the property is implicitly set to the internal default (False), overriding the explicitly specified default value True. The user therefore must also explicitly set the node property value to achieve the desired behavior of setting read-only to True.

Edit: For this PR/scenario the current behavior is the desired one, because the pipeline default value is the combination of all attributes (mount path, pvc name, sub path, read-only). If we were to treat each attribute as a default things would get confusing very quickly.

However, the issue should apply to boolean properties:

  • boolean pipeline default property ABC's value is set to True
  • node value rendering of True/False would always override the default because 'not-set' cannot be modeled

@kevin-bates
Copy link
Member

I just realized that using a checkbox as input widget for boolean values causes a usability issue.

This seems more of a matter of using the wrong default value in the schema than a usability issue relative to booleans and checkboxes.

@kevin-bates
Copy link
Member

(Our comments crossed.)

node value rendering of True/False would always override the default because 'not-set' cannot be modeled

Yes. This is where I thought your original comment was heading and, if "not-set" is a third, distinct, state, then boolean is probably not the right choice. Typically, some other (higher scoped) attribute would reflect "not-set".

@ptitzler
Copy link
Member Author

(Our comments crossed.)

node value rendering of True/False would always override the default because 'not-set' cannot be modeled

Yes. This is where I thought your original comment was heading and, if "not-set" is a third, distinct, state, then boolean is probably not the right choice. Typically, some other (higher scoped) attribute would reflect "not-set".

Yep, sorry. I also forgot that we already do this for the Disable node caching property.

When no pipeline default value is set, these are the choices a user has when configuring a node property value:

image

When a pipeline default value is set, these are the choices a user has when configuring a node property value (the default is pre-selected):

image

For the sake of consistency, should we always render booleans as select lists instead of checkboxes to avoid using different widgets to collect input for the same data type?

@kevin-bates
Copy link
Member

I find it confusing for the case when no pipeline default exists that neither True nor False are selected and I have no idea which of the two values would be used. In addition, when a pipeline default does exist, I find the "duplicated" value (True in this case) a little confusing and would rather there still just be two options - with the pipeline default indication placed on the applicable value.

Could we make a rule that all boolean properties must specify a default (and/or if a default is not specified, False is the default)? If that were the case, would this usability issue be resolved?

Copy link
Member

@kiersten-stokes kiersten-stokes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two minor comments below from me! Otherwise, code is looking good to me!

elyra/pipeline/airflow/processor_airflow.py Outdated Show resolved Hide resolved
elyra/pipeline/component_parameter.py Outdated Show resolved Hide resolved
@ptitzler
Copy link
Member Author

PR should now be in synch with main. The bloated commit history is a result of earlier synch-ups with the prerequisite PR #2957, while it was a work in progress.

@akchinSTC
Copy link
Member

@marthacryan - is the checkbox change made here or in the pipeline-editor repo?

@ptitzler ptitzler changed the title Extend data volume node property Pipeline Editor: Extend data volume node property Oct 11, 2022
@ptitzler
Copy link
Member Author

ptitzler commented Oct 11, 2022

re #2961 (comment)
We've discussed this in today's dev meeting. We'll need input from @marthacryan

  • First screen capture scenario: what would it take to support a label (e.g "use runtime default") that is associated with a null/none value? Currently the label text is persisted in the .pipeline file. The problem is that this text might change across releases and therefore would require migration. mostly resolved by 24ac087 (the node property rendering Use runtime default doesn't indicate yet when the displayed value is the pipeline default)
  • Second screen capture scenario: would it be possible to de-duplicate the select list. In the example shown only two choices would be rendered instead of three: "True (pipeline default)" and "False". The '(pipeline default)' suffix is only applied to a label if a pipeline default value is defined for the property. Pipeline editor: display semantically de-duplicated entries for enums #2966

@ptitzler
Copy link
Member Author

@marthacryan - is the checkbox change made here or in the pipeline-editor repo?

Discussed this in today's dev meeting. The change will be delivered in a separate PR for #2967

Copy link
Member

@kiersten-stokes kiersten-stokes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Mounts:
      /mnt/vol1 from test-2-volume (ro,path="sub-dir-1")
      /var/run/argo from var-run-argo (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-editor-token-jf5kd (ro)

@akchinSTC akchinSTC merged commit 3e33fdc into elyra-ai:main Oct 13, 2022
@ptitzler ptitzler deleted the extend-volume-property branch October 13, 2022 16:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:pipeline-editor pipeline editor kind:enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pipeline editor: extend data volume node property to support additional customizations
4 participants