-
Notifications
You must be signed in to change notification settings - Fork 448
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Delete Suggestion deployment after Experiment is finished #1150
Delete Suggestion deployment after Experiment is finished #1150
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for doing this!
I left few comments.
@@ -49,9 +49,11 @@ type ExperimentSpec struct { | |||
|
|||
NasConfig *NasConfig `json:"nasConfig,omitempty"` | |||
|
|||
// If false/true, which means delete/resume Suggestion after experiment is finished | |||
ResumeExperiment bool `json:"resumeExperiment,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use this parameter in the future PRs also for #1062?
For example:
ResumeExperiment: "never"
- Always delete Suggestion deployment after Experiment is finished and never restart it.
ResumeExperiment: "alwaysRunning"
- Suggestion deployment is in always running mode.
ResumeExperiment: "fromPersistentVolume"
- Suggestion deployment is deleted after Experiment is finished and we can restore it from PV.
This is just my thought, we can think about better naming.
/cc @gaocegege @johnugeorge @richardsliu
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andreyvelich Roughly, this design sounds to me. There are a few things I want to double check:
- According to bottom comment, when
ResumeExperiment: "never"
, suggestion instance itself will still be alive with terminated status after experiment who owns it finished. We only evict deployment and service. - If
ResumeExperiment: "fromPersistentVolume"
, we will try to retrieve suggestion status from PV when initializing suggestion instance. And we will preserve the suggestion instance into PV after the termination of experiment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sperlingxx Yeah, it is correct.
As I mentioned bellow, should Suggestion CR be always running? What do you think?
Even if we want to delete deployment after Experiment is finished, Suggestion CR still has useful information for the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andreyvelich I agree on keeping Suggestion CR. It may be useful, and it will be deleted after the remove of experiment CR who owns it.
@@ -49,9 +49,11 @@ type ExperimentSpec struct { | |||
|
|||
NasConfig *NasConfig `json:"nasConfig,omitempty"` | |||
|
|||
// If false/true, which means delete/resume Suggestion after experiment is finished | |||
ResumeExperiment bool `json:"resumeExperiment,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use this parameter in the future PRs also for #1062?
For example:
ResumeExperiment: "never"
- Always delete Suggestion deployment after Experiment is finished and never restart it.
ResumeExperiment: "alwaysRunning"
- Suggestion deployment is in always running mode.
ResumeExperiment: "fromPersistentVolume"
- Suggestion deployment is deleted after Experiment is finished and we can restore it from PV.
This is just my thought, we can think about better naming.
What do you think @sperlingxx ?
/cc @gaocegege @johnugeorge @richardsliu
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is already updated.
@andreyvelich request another review, thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this change. I left few comments
|
||
const ( | ||
NeverResume ResumePolicyType = "Never" | ||
LongRunning ResumePolicyType = "LongRunning" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the name ResumePolicy
.
What do you think about the name LongRunning
? Is it better than AlwaysRunning
?
/cc @johnugeorge @gaocegege @richardsliu
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gaocegege @johnugeorge @richardsliu Do you have any ideas about the naming?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make it more consistent, how about keeping options to "Never" and "Always"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@johnugeorge As which proposed in #1062 , there will be the third policy type representing "resuming experiment from persistent format"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @sperlingxx.
In case when Experiment is not deleted and always running, ResumeExperiment: Always
sounds not very obviously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM. I think we can use LongRunning
4d35ba6
to
590deaf
Compare
@johnugeorge @andreyvelich Any other suggestion? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
Sorry for being late. |
Good idea! But I'm busy with other business these days, I think I can offer the test case in early (maybe mid) May. |
I think we can merge this PR and add e2e test in the future PR. |
Of course. I think we can. /cc @johnugeorge |
Sounds good |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: johnugeorge The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This PR responds to the #1061 issue.
I'd like to implement suggestion eviction at first. Then work on how to persist the status of suggestion #1062 (maybe through pv?).
@andreyvelich @johnugeorge @richardsliu @hougangliu @gaocegege