[YUNIKORN-1793] Handle placement rule and queue changes during initialisation #601
Codecov Report
@@ Coverage Diff @@
## master #601 +/- ##
==========================================
+ Coverage 77.46% 77.56% +0.10%
==========================================
Files 77 78 +1
Lines 12927 12977 +50
==========================================
+ Hits 10014 10066 +52
+ Misses 2593 2591 -2
Partials 320 320
... and 1 file with indirect coverage changes
3ef792d to a2379d8
fdfde49 to 244d52c
Updated the approach to move the queue placement logic entirely into the placement manager, and incorporated YUNIKORN-19 (always use placement manager). This simplifies the logic considerably: the placement manager is now always active, and if it returns a queue, that queue is always created if it does not exist.
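The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Rule`, `Scheduler` and `PlaceApp` are hypothetical names, and the assumption is simply that rules run in order and the first queue returned is created when missing.

```go
package main

import "fmt"

// Rule is a hypothetical placement rule. It returns a queue name,
// or "" when the rule does not match the user.
type Rule func(user string) string

// Scheduler is a hypothetical stand-in for the partition.
// It only tracks which queues exist and which rules are configured.
type Scheduler struct {
	queues map[string]bool
	rules  []Rule
}

// PlaceApp always runs the rules (placement is always active).
// The first queue returned by a rule is created if it does not exist.
func (s *Scheduler) PlaceApp(appID, user string) (string, error) {
	for _, rule := range s.rules {
		if name := rule(user); name != "" {
			if !s.queues[name] {
				s.queues[name] = true // create the queue on demand
			}
			return name, nil
		}
	}
	return "", fmt.Errorf("no placement rule matched application %s", appID)
}
```

With a single rule that maps user `alice` to `root.alice`, calling `PlaceApp("app-1", "alice")` on a scheduler that only has `root` would return `root.alice` and leave that queue registered.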
I think a 'recovery' rule will simplify this solution. The rule would not be available in the configurable list and would point to a fixed queue that cannot be created in the config. That queue would have a fixed ACL matching an invalid user that cannot be specified. The rule would trigger purely on a tag that the k8shim always sets based on the state of init.
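A rough sketch of the suggested recovery rule, under stated assumptions: the tag name `example.com/forced-create`, the queue name `root.recovery-example`, and the `App` and `placeApp` names are all hypothetical, and the fallback ordering (configured rules first, recovery last) is an assumption rather than something this comment specifies.

```go
package main

const (
	// forcedCreateTag is a hypothetical tag the shim would set during init.
	forcedCreateTag = "example.com/forced-create"
	// recoveryQueueName is a hypothetical fixed queue that is not configurable.
	recoveryQueueName = "root.recovery-example"
)

// App is a hypothetical, reduced view of an application.
type App struct {
	ID   string
	Tags map[string]string
}

// placeApp tries the configured rules first. Only when none match and
// the forced-create tag is set does the built-in recovery rule apply.
func placeApp(app App, configured []func(App) string) (string, bool) {
	for _, rule := range configured {
		if queue := rule(app); queue != "" {
			return queue, true
		}
	}
	if app.Tags[forcedCreateTag] == "true" {
		return recoveryQueueName, true
	}
	return "", false
}
```

Because the recovery rule is hard-coded after the configured list, it cannot be added, removed or reordered through the configuration in this sketch.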
if err != nil {
return fmt.Errorf("failed to create rule based queue %s for application %s", queueName, appID)
if common.IsRecoveryQueue(queueName) {
queue, err = pc.createRecoveryQueue()
I do not think that creating the recovery queue should ever fail. It always needs to be there or be created; otherwise we are in the same situation again, with an unrecoverable application.
It should not fail, however it's still possible (and is most definitely a bug if it does). Nothing configuration-related is checked, only sanity checks such as the root queue being provided properly.
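To illustrate the point about sanity checks, here is a minimal sketch of a recovery queue creator that validates only its input and nothing configuration-related. The `Queue` type, the `recovery-example` name and this `createRecoveryQueue` signature are assumptions for illustration and may differ from the code in the PR.

```go
package main

import "errors"

// recoveryQueueName is a hypothetical leaf name for the recovery queue.
const recoveryQueueName = "recovery-example"

// Queue is a hypothetical, reduced queue type.
type Queue struct {
	Name     string
	Parent   *Queue
	Children map[string]*Queue
}

// createRecoveryQueue returns the existing recovery queue or creates it.
// The only failure modes are sanity checks on the root queue argument.
func createRecoveryQueue(root *Queue) (*Queue, error) {
	if root == nil {
		return nil, errors.New("recovery queue: root queue is nil")
	}
	if root.Parent != nil {
		return nil, errors.New("recovery queue: parent is not the root queue")
	}
	if existing, ok := root.Children[recoveryQueueName]; ok {
		return existing, nil // idempotent: reuse the queue if it is already there
	}
	if root.Children == nil {
		root.Children = make(map[string]*Queue)
	}
	queue := &Queue{Name: recoveryQueueName, Parent: root}
	root.Children[recoveryQueueName] = queue
	return queue, nil
}
```

In this sketch an error can only come from a caller passing a bad root, which matches the view that a failure here would be a bug rather than a configuration problem.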
244d52c to 909ecb3
Rebased on latest master.
89a304d to aca129e
…lisation Adds a tag to mark applications as forced create. This allows most validation to be suppressed and if necessary, will redirect the app to a recovery queue.
Refactored CreateRecoveryQueue / CreateDynamicQueue implementation
aca129e to 9df92cf
Recovery queue now always fails ACL checks so it can only be submitted to if the recovery rule triggers it.
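One way to picture this, as a sketch under assumptions: the queue's own ACL check always denies access for the recovery queue, and only a flag set by the recovery rule lets a submission through. `Queue`, `isRecoveryQueue`, `checkSubmitAccess`, `canSubmit` and the queue name are hypothetical here.

```go
package main

import "strings"

// recoveryQueueName is a hypothetical fixed name for the recovery queue.
const recoveryQueueName = "root.recovery-example"

// Queue is a hypothetical queue with a simple submit ACL.
type Queue struct {
	Name        string
	SubmitUsers map[string]bool
}

// isRecoveryQueue reports whether the name is the recovery queue.
func isRecoveryQueue(name string) bool {
	return strings.EqualFold(name, recoveryQueueName)
}

// checkSubmitAccess is the normal ACL check. It always fails for the
// recovery queue, whatever user is supplied.
func (q *Queue) checkSubmitAccess(user string) bool {
	if isRecoveryQueue(q.Name) {
		return false
	}
	return q.SubmitUsers[user]
}

// canSubmit allows the recovery queue only when the recovery rule placed
// the application there; all other queues go through the ACL check.
func canSubmit(q *Queue, user string, placedByRecoveryRule bool) bool {
	if isRecoveryQueue(q.Name) {
		return placedByRecoveryRule
	}
	return q.checkSubmitAccess(user)
}
```

A user naming the recovery queue directly gets `false` from `canSubmit`, while an application redirected by the recovery rule gets `true`.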
+1 LGTM (after Wilfred's comments are addressed)
@wilfred-s I believe all the comments have been addressed. Can you do a final review?
What is this PR for?
Adds a tag to mark applications as forced create. This allows most validation to be suppressed and if necessary, will redirect the app to a recovery queue.
What type of PR is it?
Todos
What is the Jira issue?
https://issues.apache.org/jira/browse/YUNIKORN-1793
How should this be tested?
Screenshots (if appropriate)
Questions: