Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add multi-engine support #293

Merged
merged 6 commits into from
Mar 4, 2023
Merged

Conversation

maxsmythe
Copy link
Contributor

This PR allows us to have multiple engines registered with a constraint framework client, which is a prerequisite for integrating with the CEL KEP:

https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/3488-cel-admission-control

@codecov-commenter
Copy link

codecov-commenter commented Feb 16, 2023

Codecov Report

Patch coverage has no change and project coverage change: +3.47 🎉

Comparison is base (89ae905) 49.30% compared to head (3a6d7ca) 52.78%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #293      +/-   ##
==========================================
+ Coverage   49.30%   52.78%   +3.47%     
==========================================
  Files          69       67       -2     
  Lines        4468     4583     +115     
==========================================
+ Hits         2203     2419     +216     
+ Misses       2021     1893     -128     
- Partials      244      271      +27     
Flag Coverage Δ
unittests 52.78% <ø> (+3.47%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...nt/frameworks/constraint/pkg/client/client_opts.go 80.95% <0.00%> (-19.05%) ⬇️
...y-agent/frameworks/constraint/pkg/client/client.go 85.19% <0.00%> (-3.77%) ⬇️
...y-agent/frameworks/constraint/pkg/client/errors.go 0.00% <0.00%> (ø)
...ent/frameworks/constraint/pkg/client/new_client.go 100.00% <0.00%> (ø)
.../pkg/apis/templates/v1/constrainttemplate_types.go 100.00% <0.00%> (ø)
...apis/templates/v1beta1/constrainttemplate_types.go 100.00% <0.00%> (ø)
...pis/templates/v1alpha1/constrainttemplate_types.go 100.00% <0.00%> (ø)
...rks/constraint/pkg/client/drivers/remote/remote.go
...eworks/constraint/pkg/client/drivers/local/rego.go
...meworks/constraint/pkg/client/drivers/local/new.go
... and 23 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Contributor

@acpana acpana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

taking a first pass; overall looks solid! excited to see this in

constraint/pkg/client/client.go Show resolved Hide resolved
constraint/pkg/client/client.go Show resolved Hide resolved
constraint/pkg/client/client.go Show resolved Hide resolved
constraint/pkg/client/client.go Outdated Show resolved Hide resolved
constraint/pkg/client/client.go Outdated Show resolved Hide resolved
Comment on lines +60 to +64
client.drivers[d.Name()] = d
client.driverPriority[d.Name()] = len(client.drivers)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seeing the way we use these maps throughout, I'm wondering if it'd be easier to have just on map of driver_name -> { driver, priority }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really.

There is the argument that it's harder to keep the state of two different maps in sync, except this function is the only place where these maps get altered.

This is counterbalanced against the fact that it'd be slightly harder to write something like driverForTemplate(), since you wouldn't be able to rely on conventions like a missing priority entry == 0, plus you'd need an additional struct.

In the end it's a wash.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If d.Name() already exists as part of client.drivers, could this alter the driverPriority since the total len could be different?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, I should have checked for uniqueness.

constraint/pkg/client/drivers/dummy/dummy.go Outdated Show resolved Hide resolved
constraint/pkg/client/drivers/interface.go Show resolved Hide resolved
constraint/pkg/client/drivers/rego/driver.go Show resolved Hide resolved
result := &types.Result{
Msg: fmt.Sprintf("rejected by driver %s: %s", d.name, d.code[strings.ToLower(constraint.GetObjectKind().GroupVersionKind().Kind)]),
Constraint: constraint,
// TODO: the engine should not determine the enforcement action -- that does not work with CEL KEP
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

highlighting TODO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LMK if you want to see this done in this PR, I figured it'd be better as a follow-on

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @sozercan

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am okay to defer, just want to capture this and attach an issue so we don't forget

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added #298

@@ -0,0 +1,198 @@
package k8scel
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mentioned this is pending controller-runtime update?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. We can't use the K8s CEL code without the change that makes the result objects public (which is bleeding edge), and we can't use bleeding-edge k8s code until controller-runtime supports it.

return resp, fmt.Errorf("%w: available drivers: %v, wanted %q", clienterrors.ErrNoDriver, c.driverPriority, c.driverForTemplate(templ))
}

// TODO: because different targets may have different code sets,
Copy link
Member

@sozercan sozercan Feb 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

highlighting TODO. you mentioned this is used for shared cache for referrential data, since this is out of scope for CEL at this time, can we create an issue for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a different TODO. This one is about supporting multi-target templates.

I probably wouldn't create a bug for this, since we are not actively working on multi-target templates ATM, and this would be part of that larger body of work. Having the TODO in the code is useful to avoid missing this nuance if/when we start working on it (issues are more likely to be missed).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LMK if you'd prefer the issue though

if cachedCpy != nil && cachedCpy.SemanticEqual(templ) {
// if there is more than one active driver for the template, there is some cleanup to do
// from a botched driver swap.
if cachedCpy != nil && cachedCpy.SemanticEqual(templ) && len(cached.activeDrivers) == 1 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this only true when there is only 1 activeDrivers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the contents haven't change and there is only one active driver, there is no work to do and we can return early.

If the contents have changed, then there is work to do to update the template code.

If the contents have not change, but there is more than one active driver, then there is cleanup to do (only one driver should be active at a time).

If the contents have not changed, and there are zero active drivers... this state shouldn't happen, but we should add the template code to a driver so we have an active driver.

c.templates[templateName] = cacheEntry
}

cacheEntry.activeDrivers[newDriverN] = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is activeDrivers thread safe? does it need any locks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would not be thread safe, but I believe cacheEntry is gated by the c.mtx lock above?


err := driver.RemoveTemplate(ctx, template)
if err != nil {
return resp, err
Copy link
Member

@ritazh ritazh Feb 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return early here prevents it from attempting to continue to remove it from other drivers. is this intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the idea is that a failed removal needs to be met with a retry (as in a controller). If we cannot fully remove a template from a driver and cannot recover from that state, then the system is in an unknown state and the program should die (e.g. we have stale Rego code in OPA, who knows what impact it may have)

// storage. We should work to remove the need for this special case
// by building a global storage object. Right now Rego needs its own
// cache to cache constraints.
if _, ok := c.drivers[regoSchema.Name]; ok {
Copy link
Member

@ritazh ritazh Feb 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if !ok is returned? do we want to add err to errMap at least? do we still want to call .AddData if the item is not found?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is just saying "only call add/remove data for the Rego driver".

Having multiple drivers maintain separate caches would lead to excess RAM usage.

Ultimately, what we need is a cache that can be shared across drivers, but that is quite a bit of work, and not necessary for the CEL KEP (which has no notion of referential constraints).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we at least log it if it's not a rego driver?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern there would be log verbosity. For example: someone who uses Gatekeeper who watches pods would see a log line every time a pod was updated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OTOH, Gatekeeper would have the rego driver, so no need to log

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

19851c7 LGTM. Can you pls resolve the file conflict?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Resolved.

@maxsmythe maxsmythe requested review from sozercan, acpana and ritazh and removed request for acpana February 27, 2023 23:05
Copy link
Contributor

@acpana acpana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for addressing the comments I made 🥇

Copy link
Member

@sozercan sozercan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! LGTM

@sozercan
Copy link
Member

sozercan commented Mar 2, 2023

@maxsmythe do you mind if we merge #300 first so we can cut a release branch and tag for v3.12?

@ritazh
Copy link
Member

ritazh commented Mar 4, 2023

Seems gatekeeper_test is failing?

Copy link
Member

@ritazh ritazh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
one small nit and test failure

@maxsmythe
Copy link
Contributor Author

gatekeeper test is failing b/c of the renaming of the "local" driver package to "rego", so is expected.

Signed-off-by: Max Smythe <smythe@google.com>
Signed-off-by: Max Smythe <smythe@google.com>
Signed-off-by: Max Smythe <smythe@google.com>
Signed-off-by: Max Smythe <smythe@google.com>
…supports it

Signed-off-by: Max Smythe <smythe@google.com>
Signed-off-by: Max Smythe <smythe@google.com>
@maxsmythe maxsmythe merged commit d82cbe1 into open-policy-agent:master Mar 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants