Katib Launcher Experiment Name Conflict #3508

Merged: 5 commits, May 5, 2020
9 changes: 6 additions & 3 deletions components/kubeflow/katib-launcher/src/launch_experiment.py
@@ -19,6 +19,7 @@
import os
import logging
import yaml
+import uuid
import launch_crd

from kubernetes import client as k8s_client
@@ -99,11 +100,13 @@ def main(argv=None):
config.load_incluster_config()
api_client = k8s_client.ApiClient()
experiment = Experiment(version=args.version, client=api_client)
+exp_name = (args.name+'-'+uuid.uuid4().hex)[0:63]

inst = {
"apiVersion": "%s/%s" % (ExperimentGroup, args.version),
"kind": "Experiment",
"metadata": {
-"name": args.name,
+"name": exp_name,
Contributor

I think names might be limited to 63 characters max.

Contributor Author

Yeah, the current implementation would leave 31 characters for the user-supplied name.
Would you advise appending a shorter UUID instead of the full 32-character one?

Contributor

I think the name should be explicitly truncated. For example, exp_name[0:63].
BTW, what do you think about just using generated names?
For example you can have

"generateName": sanitize(args.name[0:31]) + '-',

Or even use fully generated names.
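
To make the generateName suggestion concrete, here is a minimal sketch of what such a prefix helper could look like. The sanitize() in the comment above is not defined anywhere in this PR, so both functions below are hypothetical, and the character rules are an assumption based on the usual DNS-label convention (lowercase alphanumerics and '-', starting and ending with an alphanumeric). Unlike the one-liner above, this version sanitizes first and truncates afterwards, so the 31-character budget applies to the cleaned-up name.

```python
import re


def sanitize(name):
    """Hypothetical helper: lowercase the name and collapse every run of
    characters outside [a-z0-9-] into a single '-', then trim '-' from both ends."""
    cleaned = re.sub(r"[^a-z0-9-]+", "-", name.lower())
    return cleaned.strip("-")


def generate_name_prefix(name, max_prefix=31, fallback="experiment"):
    """Build a value for metadata.generateName.

    max_prefix=31 mirrors the number used in the comment above; the fallback
    string is an assumption for names that sanitize down to nothing.
    """
    prefix = sanitize(name)[:max_prefix].rstrip("-")
    return (prefix or fallback) + "-"


if __name__ == "__main__":
    print(generate_name_prefix("My_Experiment 1"))
```

The server would then append its own random suffix to this prefix when the object is created.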

Contributor Author

Yes, generateName is actually better!
I tested generateName with a Job template: it works with the kubectl create command but not with kubectl apply, which probably makes sense because apply needs a fixed name to match against an existing resource.
I also checked that this code creates the resource in experiment.create(), so generateName should work here too.

Contributor Author

But args.name is also used in the wait_for_condition() and delete() calls, and I am not sure how those would work if the name is randomly generated at execution time.

Contributor

The return value from create might have the actual name.

P.S. I'm OK with keeping your name generation, but I'd like to prevent the length from going over the limit.
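
A rough sketch of the "return value from create" idea, for reference. It assumes the create call returns the created object as a dict with the server-assigned name under metadata.name; whether experiment.create() in launch_crd actually does this is not confirmed in this thread. The create_fn parameter and the function name are hypothetical stand-ins, not part of the launcher.

```python
def create_with_generated_name(create_fn, api_version, prefix, namespace, spec):
    """Create an Experiment using generateName and return the actual name.

    create_fn is a hypothetical callable that submits the manifest and returns
    the created object as a dict (an assumption about the client's behaviour).
    """
    body = {
        "apiVersion": api_version,
        "kind": "Experiment",
        "metadata": {
            # No "name" key: the API server derives it from generateName.
            "generateName": prefix,
            "namespace": namespace,
        },
        "spec": spec,
    }
    created = create_fn(body)
    # The generated name only exists in the response, not in the request body.
    return created["metadata"]["name"]
```

The returned name could then be passed to wait_for_condition() and delete() in place of args.name, which would address the concern raised above.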

Contributor Author

Updated it so that the experiment name is limited to 63 characters.
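
One thing worth noting about truncating the combined string: if args.name is itself 63 characters or longer, the slice cuts off the whole UUID and the name is no longer unique. A possible alternative, sketched below under the assumption that a 63-character limit applies, is to truncate only the user-supplied part so that the random suffix always survives. The function name and the default suffix length of 8 are illustrative choices, not something taken from this PR.

```python
import uuid

MAX_NAME_LEN = 63  # assumed limit, as discussed in this thread


def unique_name(base, suffix_len=8, max_len=MAX_NAME_LEN):
    """Append a random hex suffix, truncating the base rather than the suffix."""
    suffix = uuid.uuid4().hex[:suffix_len]
    # Reserve room for the suffix plus the '-' separator.
    room = max_len - len(suffix) - 1
    return base[:room].rstrip("-") + "-" + suffix


if __name__ == "__main__":
    print(unique_name("my-experiment"))
    print(len(unique_name("a" * 100)))
```

A shorter suffix leaves more room for the user's name but carries fewer random bits (8 hex characters is 32 bits), so the suffix length is a trade-off rather than a fixed answer.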

"namespace": args.namespace,
},
"spec": {
@@ -121,7 +124,7 @@ def main(argv=None):

expected_conditions = ["Succeeded", "Failed"]
current_inst = experiment.wait_for_condition(
-args.namespace, args.name, expected_conditions,
+args.namespace, exp_name, expected_conditions,
timeout=datetime.timedelta(minutes=args.experimentTimeoutMinutes))
expected, conditon = experiment.is_expected_conditions(current_inst, ["Succeeded"])
if expected:
@@ -131,7 +134,7 @@ def main(argv=None):
with open(args.outputFile, 'w') as f:
f.write(json.dumps(paramAssignments))
if args.deleteAfterDone:
-experiment.delete(args.name, args.namespace)
+experiment.delete(exp_name, args.namespace)

if __name__== "__main__":
main()