-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add multi-engine support #293
Conversation
a2ba22e
to
ce02d62
Compare
Codecov ReportPatch coverage has no change and project coverage change:
Additional details and impacted files@@ Coverage Diff @@
## master #293 +/- ##
==========================================
+ Coverage 49.30% 52.78% +3.47%
==========================================
Files 69 67 -2
Lines 4468 4583 +115
==========================================
+ Hits 2203 2419 +216
+ Misses 2021 1893 -128
- Partials 244 271 +27
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
taking a first pass; overall looks solid! excited to see this in
client.drivers[d.Name()] = d | ||
client.driverPriority[d.Name()] = len(client.drivers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seeing the way we use these maps throughout, I'm wondering if it'd be easier to have just on map of driver_name -> { driver, priority }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really.
There is the argument that it's harder to keep the state of two different maps in sync, except this function is the only place where these maps get altered.
This is counterbalanced against the fact that it'd be slightly harder to write something like driverForTemplate(), since you wouldn't be able to rely on conventions like a missing priority entry == 0, plus you'd need an additional struct.
In the end it's a wash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If d.Name() already exists as part of client.drivers, could this alter the driverPriority since the total len could be different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, I should have checked for uniqueness.
result := &types.Result{ | ||
Msg: fmt.Sprintf("rejected by driver %s: %s", d.name, d.code[strings.ToLower(constraint.GetObjectKind().GroupVersionKind().Kind)]), | ||
Constraint: constraint, | ||
// TODO: the engine should not determine the enforcement action -- that does not work with CEL KEP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
highlighting TODO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LMK if you want to see this done in this PR, I figured it'd be better as a follow-on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping @sozercan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am okay to defer, just want to capture this and attach an issue so we don't forget
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added #298
@@ -0,0 +1,198 @@ | |||
package k8scel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mentioned this is pending controller-runtime update?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. We can't use the K8s CEL code without the change that makes the result objects public (which is bleeding edge), and we can't use bleeding-edge k8s code until controller-runtime supports it.
return resp, fmt.Errorf("%w: available drivers: %v, wanted %q", clienterrors.ErrNoDriver, c.driverPriority, c.driverForTemplate(templ)) | ||
} | ||
|
||
// TODO: because different targets may have different code sets, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
highlighting TODO. you mentioned this is used for shared cache for referrential data, since this is out of scope for CEL at this time, can we create an issue for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a different TODO. This one is about supporting multi-target templates.
I probably wouldn't create a bug for this, since we are not actively working on multi-target templates ATM, and this would be part of that larger body of work. Having the TODO in the code is useful to avoid missing this nuance if/when we start working on it (issues are more likely to be missed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LMK if you'd prefer the issue though
if cachedCpy != nil && cachedCpy.SemanticEqual(templ) { | ||
// if there is more than one active driver for the template, there is some cleanup to do | ||
// from a botched driver swap. | ||
if cachedCpy != nil && cachedCpy.SemanticEqual(templ) && len(cached.activeDrivers) == 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this only true when there is only 1 activeDrivers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the contents haven't change and there is only one active driver, there is no work to do and we can return early.
If the contents have changed, then there is work to do to update the template code.
If the contents have not change, but there is more than one active driver, then there is cleanup to do (only one driver should be active at a time).
If the contents have not changed, and there are zero active drivers... this state shouldn't happen, but we should add the template code to a driver so we have an active driver.
c.templates[templateName] = cacheEntry | ||
} | ||
|
||
cacheEntry.activeDrivers[newDriverN] = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is activeDrivers thread safe? does it need any locks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would not be thread safe, but I believe cacheEntry
is gated by the c.mtx
lock above?
|
||
err := driver.RemoveTemplate(ctx, template) | ||
if err != nil { | ||
return resp, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return early here prevents it from attempting to continue to remove it from other drivers. is this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, the idea is that a failed removal needs to be met with a retry (as in a controller). If we cannot fully remove a template from a driver and cannot recover from that state, then the system is in an unknown state and the program should die (e.g. we have stale Rego code in OPA, who knows what impact it may have)
// storage. We should work to remove the need for this special case | ||
// by building a global storage object. Right now Rego needs its own | ||
// cache to cache constraints. | ||
if _, ok := c.drivers[regoSchema.Name]; ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if !ok is returned? do we want to add err to errMap at least? do we still want to call .AddData if the item is not found?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is just saying "only call add/remove data for the Rego driver".
Having multiple drivers maintain separate caches would lead to excess RAM usage.
Ultimately, what we need is a cache that can be shared across drivers, but that is quite a bit of work, and not necessary for the CEL KEP (which has no notion of referential constraints).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we at least log it if it's not a rego driver?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern there would be log verbosity. For example: someone who uses Gatekeeper who watches pods would see a log line every time a pod was updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTOH, Gatekeeper would have the rego driver, so no need to log
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
19851c7 LGTM. Can you pls resolve the file conflict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for addressing the comments I made 🥇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! LGTM
@maxsmythe do you mind if we merge #300 first so we can cut a release branch and tag for v3.12? |
Seems gatekeeper_test is failing? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
one small nit and test failure
gatekeeper test is failing b/c of the renaming of the "local" driver package to "rego", so is expected. |
Signed-off-by: Max Smythe <smythe@google.com>
Signed-off-by: Max Smythe <smythe@google.com>
Signed-off-by: Max Smythe <smythe@google.com>
Signed-off-by: Max Smythe <smythe@google.com>
…supports it Signed-off-by: Max Smythe <smythe@google.com>
Signed-off-by: Max Smythe <smythe@google.com>
This PR allows us to have multiple engines registered with a constraint framework client, which is a prerequisite for integrating with the CEL KEP:
https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/3488-cel-admission-control