Support K8 resource YAML config as input for DSL components #756

Closed
swiftdiaries opened this issue Jan 31, 2019 · 7 comments
@swiftdiaries
Member

Part of a set of issues related to better support for k8s resources in Pipelines.

It'd be better if we could re-use existing implementations when creating individual components.
For example, TFJob and most CRDs in Kubeflow already have a ksonnet prototype defined. If we could use these as input to create components, it'd be really cool.
Another use case is when you want features from Argo that are not yet implemented in the Pipelines DSL, such as the docker-in-docker workflows for Seldon: YAML

Related issue: #677
Related discussion: https://kubeflow.slack.com/archives/CE10KS9M4/p1548777905172300

@Ark-kun Ark-kun self-assigned this Feb 1, 2019
@Ark-kun
Contributor

Ark-kun commented Feb 1, 2019

CRs are somewhat limited as components: components should have inputs and outputs, while CRs offer little support for either.
The way both Pipelines (see the Kubeflow TFJob launcher) and Argo currently handle CRs is by using a controlling container.
We may provide an easier way to submit and wait for CRs in the future, but the problem with inputs and outputs remains.

@Ark-kun
Contributor

Ark-kun commented Feb 1, 2019

Does this do what you want to accomplish?

import kfp.components as comp

create_seldon_deployment_op = comp.load_component_from_text(
'''
name: Create Seldon deployment
implementation:
  container:
    image: lachlanevenson/k8s-kubectl
    command: [sh, -c, 'echo "$0" | kubectl apply -f -']
    args:
    - |
      apiVersion: "machinelearning.seldon.io/v1alpha1"
      kind: "SeldonDeployment"
      ...
''')

def pipeline():
    create_seldon_deployment_task = create_seldon_deployment_op()
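
A usage sketch, assuming the KFP v1 SDK: a pipeline function like the one above can then be compiled into an Argo workflow package (the output filename here is arbitrary).

import kfp.compiler as compiler

# Compile the pipeline function defined above into an Argo workflow package
compiler.Compiler().compile(pipeline, 'seldon_pipeline.yaml')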

@swiftdiaries
Member Author

Thanks for the reply! This solves the problem of deploying YAML. However, we don't get the associated logs / status for the deployment; we'd need another container that fetches the status information for the CRs. Managing k8s objects and CRs natively from the DSL should be the long-term solution. If we could do that by wrapping the k8s python client or Argo YAML instead of rebuilding every feature, it would be quite helpful.

For example, for volume usage.

def use_local_volume(pvc_name='pipeline-claim', volume_name='pipeline', volume_mount_path='/mnt/pipeline'):
    """Modifier function to apply to a ContainerOp.

    Simplifies adding a volume and volume mount, and enables better reuse of
    volumes and volume claims across container ops.

    Usage:
        train = train_op(...)
        train.apply(use_local_volume('claim-name', 'pipeline', '/mnt/pipeline'))
    """
    def _use_local_volume(task):
        from kubernetes import client as k8s_client
        # Reference an existing PVC by claim name, then attach and mount it
        local_pvc = k8s_client.V1PersistentVolumeClaimVolumeSource(claim_name=pvc_name)
        return (
            task
            .add_volume(
                k8s_client.V1Volume(name=volume_name, persistent_volume_claim=local_pvc)
            )
            .add_volume_mount(
                k8s_client.V1VolumeMount(mount_path=volume_mount_path, name=volume_name)
            )
        )
    return _use_local_volume

Then to define and use the volume, we could do something like:

# Volume definition
local_volume = use_local_volume('pipeline-claim', 'pipeline', '/mnt/pipeline')

# Add volume to ContainerOp named load_data
load_data.apply(local_volume)

This takes something verbose and long-winded and makes it more readable and easier to write. It's one approach, I guess. There's still no way to "see" how these created / attached resources behave in our pipeline, or even whether they exist, so it would be good to add those kinds of features.

TL;DR - Allow a way from the DSL to create and manage k8s resources in a more readable and less verbose manner. One way is to abstract over the k8s python client in the DSL.

@vicaire
Contributor

vicaire commented Feb 13, 2019

@swiftdiaries, would something like this work:

  1. At the low level, we provide a generic container that takes a CRD spec as a parameter and runs it to completion (a rough sketch of such a container follows this list).

  2. The DSL can implement special, easier-to-use abstractions on top of that on a case-by-case basis, for the CRDs that are used the most.

  3. For debugging, the UI should be able to view/follow the tree of K8 resources (container labels indicate the Argo workflow that created them, custom resource labels indicate which containers created them).
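
A rough sketch of what such a generic runner could look like, assuming a kubectl-based image; the component name, the success-condition input, and the `kubectl wait` invocation are all illustrative, not an existing Pipelines component:

import kfp.components as comp

run_resource_op = comp.load_component_from_text(
'''
name: Run K8s resource to completion
inputs:
- {name: Resource spec, type: String}
- {name: Success condition, type: String}
implementation:
  container:
    image: lachlanevenson/k8s-kubectl
    command:
    - sh
    - -c
    # Apply the resource, then block until the given condition holds on it
    - 'echo "$0" | kubectl apply -f - && echo "$0" | kubectl wait --for="$1" --timeout=600s -f -'
    args:
    - {inputValue: Resource spec}
    - {inputValue: Success condition}
''')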

@vicaire
Contributor

vicaire commented Feb 13, 2019

Note that ksonnet is being archived (https://ksonnet.io/), so we probably won't spend time integrating the DSL with ksonnet.

@vicaire vicaire changed the title Support YAML/ksonnet config as input for DSL components Support K8 resource YAML config as input for DSL components Mar 26, 2019
@Ark-kun Ark-kun assigned gaoning777 and elikatsis and unassigned Ark-kun Sep 3, 2019
@elikatsis
Member

@swiftdiaries, Hello!

It's been some time, so let's see what we've got and what more we need in order to close this issue.

Regarding the volume usage you mention in this comment, and K8s resource manipulation more generally (the main focus of this issue): I think this is covered by ResourceOps (introduced by #926). These implement Argo's resource template for K8s resource manipulation, and Argo provides logging for them.
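
For illustration, a minimal ResourceOp sketch, assuming the post-#926 SDK; the CR dict and the success condition below are placeholders, not taken from this thread:

import kfp.dsl as dsl

@dsl.pipeline(name='resourceop-example')
def resource_pipeline():
    # Placeholder CR spec; any K8s resource dict works here
    seldon_deployment = {
        'apiVersion': 'machinelearning.seldon.io/v1alpha1',
        'kind': 'SeldonDeployment',
        'metadata': {'name': 'example'},
    }
    dsl.ResourceOp(
        name='create-seldon-deployment',
        k8s_resource=seldon_deployment,
        action='create',
        # Argo polls the resource until this condition on its fields holds
        success_condition='status.state == Available',
    )

Argo then handles the apply-and-wait loop, and fields of the created resource can be exposed as step outputs.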

Finally, #879, which introduced sidecars, enabled the use of full K8s specifications for containers through the K8s client.
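
And a minimal sketch of the sidecar API from #879, with illustrative image and commands:

import kfp.dsl as dsl

@dsl.pipeline(name='sidecar-example')
def sidecar_pipeline():
    op = dsl.ContainerOp(
        name='train',
        image='busybox',
        command=['sh', '-c', 'echo training'],
    )
    # The sidecar runs alongside the main container for the lifetime of the step
    op.add_sidecar(dsl.Sidecar(
        name='logger',
        image='busybox',
        command=['sh', '-c', 'tail -f /dev/null'],
    ))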

So, do these cover this issue? Is there something I'm missing?

@swiftdiaries
Member Author

@elikatsis I think this can be closed :)
Thanks for the ping!

HumairAK pushed a commit to red-hat-data-services/data-science-pipelines that referenced this issue Mar 11, 2024