support trial meta injection in trial template rendering #1259

Merged
merged 8 commits into from
Jul 27, 2020

Conversation

sperlingxx
Member

@sperlingxx sperlingxx commented Jul 9, 2020

After the refactor of trial template rendering in v1beta1, users can fetch nothing but hyperparameter values. In v1alpha3, however, users could fetch the trial name via a placeholder ({{.Trial}}), which is useful in many situations, such as appending it to the model storage URL/path to avoid overwriting models.

This PR brings this feature back through an additional trial template transformation dedicated to metadata injection.

cc @andreyvelich @gaocegege @johnugeorge

@kubeflow-bot

This change is Reviewable

Member

@gaocegege gaocegege left a comment

/lgtm
/assign @andreyvelich

Member

@andreyvelich andreyvelich left a comment

Thanks @sperlingxx.
I agree that Trial name and namespace might be useful in some cases.
Maybe we could include these parameters in TrialParameters, so that the user can explicitly see which parameters will be replaced in TrialSpec?

I propose naming these parameters TrialName and TrialNamespace so that we don't block TrialParameters[].Name = "Name". One of the training container's input parameters might be a neural network name. For example, the user has various NN names and wants to find which one is the best.

For these parameters, we should change Reference. My suggestion is ${trialSpec.metadata.name} or ${trialSpec.name}, so the user can clearly see how this parameter will be replaced.

What do you think @sperlingxx @gaocegege ?

@gaocegege
Member

${trialSpec.metadata.name} or ${trialSpec.name} LGTM.

@sperlingxx
Member Author

@andreyvelich @gaocegege I am just back from vacation. For me, using ${trialSpec.metadata.name} as the reference for the trial name is a brilliant idea. I will follow it and make the changes.

@sperlingxx sperlingxx force-pushed the trial_meta_injection branch from 92ead2c to 310869b Compare July 17, 2020 00:58
@sperlingxx
Member Author

@andreyvelich @gaocegege I think this PR is ready. After the refactoring, the mapping between trial metadata placeholders (references) and trial metadata values is defined by buildTrialMetaForRunSpec.

@andreyvelich
Member

andreyvelich commented Jul 17, 2020

@sperlingxx Thank you for updating this!
As I mentioned here: #1259 (review), do we want to add these meta parameters to TrialParameters, so that our substitution in TrialSpec stays consistent?

So my suggestion is having something like this:

trialTemplate:
  trialParameters:
    - name: learningRate
      description: Learning rate for the training model
      reference: lr
    - name: trialName
      description: Name of the Trial
      reference: ${trialSpec.metadata.name}
    - name: trialNamespace
      description: Namespace of the Trial
      reference: ${trialSpec.metadata.namespace}
    - name: trialKind
      description: Kind of the Trial
      reference: ${trialSpec.kind}
    - name: trialAPIVersion
      description: API Version of the Trial
      reference: ${trialSpec.apiVersion}
  trialSpec:
    apiVersion: batch/v1
    kind: Job
    spec:
      template:
        spec:
          containers:
            - name: training-container
              image: docker.io/kubeflowkatib/mxnet-mnist
              command:
                - "python3"
                - "/opt/mxnet-mnist/mnist.py"
                - "--batch-size=64"
                - "--lr=${trialParameters.learningRate}"
              env:
                - name: Name
                  value: ${trialParameters.trialName}
                - name: Namespace
                  value: ${trialParameters.trialNamespace}
                - name: Kind
                  value: ${trialParameters.trialKind}
                - name: APIVersion
                  value: ${trialParameters.trialAPIVersion}
          restartPolicy: Never
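
The two-step substitution that this YAML implies could be sketched in Go roughly as follows: each reference is first resolved to a value, then every ${trialParameters.<name>} placeholder in the trial spec is replaced. This is only an illustration under assumptions. The types TrialParameter and TrialMeta and the functions resolveReference and renderTemplate are hypothetical names, not Katib's actual generator code, and the trial spec is treated as a plain string.

```go
package main

import (
	"fmt"
	"strings"
)

// TrialParameter mirrors the name/reference fields from the YAML above
// (hypothetical type for this sketch).
type TrialParameter struct {
	Name      string
	Reference string
}

// TrialMeta holds the trial metadata that references may point at
// (hypothetical type for this sketch).
type TrialMeta struct {
	Name, Namespace, Kind, APIVersion string
}

// resolveReference maps a reference either to trial metadata or to a
// hyperparameter assignment returned by the suggestion.
func resolveReference(ref string, meta TrialMeta, assignments map[string]string) (string, error) {
	switch ref {
	case "${trialSpec.metadata.name}":
		return meta.Name, nil
	case "${trialSpec.metadata.namespace}":
		return meta.Namespace, nil
	case "${trialSpec.kind}":
		return meta.Kind, nil
	case "${trialSpec.apiVersion}":
		return meta.APIVersion, nil
	}
	if v, ok := assignments[ref]; ok {
		return v, nil
	}
	return "", fmt.Errorf("unknown reference %q", ref)
}

// renderTemplate replaces every ${trialParameters.<name>} placeholder in
// tmpl with the value its reference resolves to.
func renderTemplate(tmpl string, params []TrialParameter, meta TrialMeta, assignments map[string]string) (string, error) {
	for _, p := range params {
		v, err := resolveReference(p.Reference, meta, assignments)
		if err != nil {
			return "", err
		}
		tmpl = strings.ReplaceAll(tmpl, "${trialParameters."+p.Name+"}", v)
	}
	return tmpl, nil
}

func main() {
	params := []TrialParameter{
		{Name: "learningRate", Reference: "lr"},
		{Name: "trialName", Reference: "${trialSpec.metadata.name}"},
	}
	meta := TrialMeta{Name: "example-trial", Namespace: "kubeflow", Kind: "Job", APIVersion: "batch/v1"}
	rendered, err := renderTemplate(
		"--lr=${trialParameters.learningRate} --model-dir=/models/${trialParameters.trialName}",
		params, meta, map[string]string{"lr": "0.01"},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(rendered)
}
```

The example values (example-trial, /models/...) are made up; the point is that metadata and hyperparameters go through the same trialParameters indirection.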

Advantages of this approach:

  1. The user can explicitly see which parameters will be replaced by reading trialParameters.
  2. It is easier to validate the Trial spec (just check that TrialSpec contains all names from TrialParameters).
  3. This approach is extensible through Reference. (In the future we can design a unified function that parses a reference and gets the appropriate value from TrialSpec.)
  4. The user can name these parameters as they wish. For example:
...
  - name: katibTrialName
    description: Name of the Trial
    reference: ${trialSpec.metadata.name}
...
env:
  - name: Name
    value: ${trialParameters.katibTrialName}
...
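
The validation mentioned in point 2 might look roughly like the Go sketch below. It assumes the trial spec is available as a string and that parameter names consist of letters, digits, underscores, and hyphens; validateTemplate and placeholderRe are hypothetical names for illustration, not part of Katib.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// placeholderRe matches ${trialParameters.<name>} and captures <name>.
// The allowed character set for names is an assumption of this sketch.
var placeholderRe = regexp.MustCompile(`\$\{trialParameters\.([A-Za-z0-9_-]+)\}`)

// validateTemplate compares declared parameter names with the placeholders
// found in the template. It returns declared names that never appear in the
// template (unused) and placeholders that were never declared (undeclared).
func validateTemplate(tmpl string, declared []string) (unused, undeclared []string) {
	used := map[string]bool{}
	for _, m := range placeholderRe.FindAllStringSubmatch(tmpl, -1) {
		used[m[1]] = true
	}
	declaredSet := map[string]bool{}
	for _, name := range declared {
		declaredSet[name] = true
		if !used[name] {
			unused = append(unused, name)
		}
	}
	for name := range used {
		if !declaredSet[name] {
			undeclared = append(undeclared, name)
		}
	}
	// Map iteration order is random, so sort for a stable result.
	sort.Strings(undeclared)
	return unused, undeclared
}

func main() {
	tmpl := "value: ${trialParameters.katibTrialName} --lr=${trialParameters.typo}"
	unused, undeclared := validateTemplate(tmpl, []string{"katibTrialName", "learningRate"})
	fmt.Println("declared but not used:", unused)
	fmt.Println("used but not declared:", undeclared)
}
```

Checking both directions is a design choice of this sketch; the comment above only asks for the first one.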

What do you think @sperlingxx @gaocegege @johnugeorge ?

Tekton also uses substitution from params (https://github.com/tektoncd/pipeline/blob/master/examples/v1beta1/pipelineruns/pipelinerun-with-params.yaml#L1-L29).

@sperlingxx
Member Author

@andreyvelich I misunderstood the meaning of Reference that you mentioned in #1259 (review). Won't this approach introduce unnecessary complexity for users, considering that they need to map trial metadata to placeholders explicitly in the trialParameters definition?
If we apply this approach, I suggest that the reference schema stay aligned with the Kubernetes object meta API. For instance,

  • ${trialSpec.Name} represents the name of the trial (job)
  • ${trialSpec.Namespace} represents the namespace of the trial (job)
  • ${trialSpec.APIVersion} represents the APIVersion of the job
  • ${trialSpec.Kind} represents the kind of the job
  • ${trialSpec.Annotations["xxx"]} refers to the value of annotation "xxx" in the job spec.
  • ${trialSpec.Labels["xxx"]} refers to the value of label "xxx" in the job spec.
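
A resolver for this reference schema could be sketched in Go as below. It is a rough illustration under assumptions, not the code in this PR: TrialMeta, indexedRe, and resolveMetaReference are hypothetical names, and annotation and label keys are assumed not to contain double quotes.

```go
package main

import (
	"fmt"
	"regexp"
)

// TrialMeta is a hypothetical holder for the metadata listed above.
type TrialMeta struct {
	Name, Namespace, APIVersion, Kind string
	Annotations, Labels               map[string]string
}

// indexedRe matches ${trialSpec.Annotations["key"]} and
// ${trialSpec.Labels["key"]}, capturing the field name and the key.
var indexedRe = regexp.MustCompile(`^\$\{trialSpec\.(Annotations|Labels)\["([^"]+)"\]\}$`)

// resolveMetaReference returns the metadata value a reference points at.
// The boolean is false when the reference is unknown or the key is absent.
func resolveMetaReference(ref string, meta TrialMeta) (string, bool) {
	switch ref {
	case "${trialSpec.Name}":
		return meta.Name, true
	case "${trialSpec.Namespace}":
		return meta.Namespace, true
	case "${trialSpec.APIVersion}":
		return meta.APIVersion, true
	case "${trialSpec.Kind}":
		return meta.Kind, true
	}
	m := indexedRe.FindStringSubmatch(ref)
	if m == nil {
		return "", false
	}
	source := meta.Labels
	if m[1] == "Annotations" {
		source = meta.Annotations
	}
	// Reading from a nil map is safe in Go and simply reports "not found".
	v, ok := source[m[2]]
	return v, ok
}

func main() {
	meta := TrialMeta{
		Name:      "example-trial",
		Namespace: "kubeflow",
		Labels:    map[string]string{"team": "ml"},
	}
	for _, ref := range []string{
		"${trialSpec.Name}",
		`${trialSpec.Labels["team"]}`,
		`${trialSpec.Annotations["missing"]}`,
	} {
		v, ok := resolveMetaReference(ref, meta)
		fmt.Printf("%s -> %q (found: %t)\n", ref, v, ok)
	}
}
```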

@andreyvelich
Member

@sperlingxx Reference was originally proposed by @gaocegege, check comment here: #1202 (comment).
Originally, it was designed to connect trialParameters[x].Name with the ParameterAssignment that Suggestion returns.
I was thinking that maybe this parameter makes sense for this feature.
Any thoughts @gaocegege @johnugeorge ?

@sperlingxx
Member Author

sperlingxx commented Jul 20, 2020

> @sperlingxx Reference was originally proposed by @gaocegege, check comment here: #1202 (comment).
> Originally, it was designed to connect trialParameters[x].Name with the ParameterAssignment that Suggestion returns.
> I was thinking that maybe this parameter makes sense for this feature.
> Any thoughts @gaocegege @johnugeorge ?

@andreyvelich Generally, I agree with that. In the latest commit, I have applied the trial-meta-by-reference approach, using the style I proposed above.

Member

@andreyvelich andreyvelich left a comment

@sperlingxx Thank you very much for updating this!
I left a few comments.

pkg/controller.v1beta1/experiment/manifest/generator.go: four outdated review threads, all resolved
@@ -7,88 +7,88 @@ package mock
import (
context "context"
gomock "github.com/golang/mock/gomock"
v1alpha3 "github.com/kubeflow/katib/pkg/apis/manager/v1alpha3"
api_v1_alpha3 "github.com/kubeflow/katib/pkg/apis/manager/v1alpha3"
Member

Do you know which version of mockgen you are using?
Why did it update a few names?

Member Author

I believe I use the master branch of mockgen.

Member

@andreyvelich andreyvelich left a comment

Thanks @sperlingxx.
/lgtm
/cc @gaocegege @johnugeorge

Member

@gaocegege gaocegege left a comment

/lgtm

Member

@andreyvelich andreyvelich left a comment

/approve

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andreyvelich

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

5 participants