fix: validate SecurityPolicy on controller and egctl translate #4987

Merged
merged 8 commits into from
Jan 10, 2025

sanposhiho
Collaborator

What type of PR is this?

What this PR does / why we need it:

  • ValidateSecurityPolicy exists, but it isn't called the way the other validation functions (ValidateEnvoyProxy, etc.) are.
  • Fix the validation: currently, every SecurityPolicy without CORS or JWT is considered invalid and fails validation.
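
A minimal sketch of the second point, assuming hypothetical stand-in types rather than the real `egv1a1` API. The individual rules (`AllowOrigins` and `Providers` must be non-empty) are invented for illustration, and `errors.Join` from the standard library is swapped in for `utilerrors.NewAggregate` so the example stays self-contained. The idea is only that sections which are not set are skipped instead of being treated as errors:

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the real API types; only the shape matters here.
type CORS struct{ AllowOrigins []string }
type JWT struct{ Providers []string }
type SecurityPolicySpec struct {
	CORS *CORS
	JWT  *JWT
}

// validateSpec checks only the sections that are actually set, so a policy
// that configures neither CORS nor JWT is still considered valid.
func validateSpec(spec *SecurityPolicySpec) error {
	if spec == nil {
		return errors.New("spec must be specified")
	}
	var errs []error
	// Illustrative rule, not taken from the real validator.
	if spec.CORS != nil && len(spec.CORS.AllowOrigins) == 0 {
		errs = append(errs, errors.New("cors: at least one origin is required"))
	}
	// Illustrative rule, not taken from the real validator.
	if spec.JWT != nil && len(spec.JWT.Providers) == 0 {
		errs = append(errs, errors.New("jwt: at least one provider is required"))
	}
	// errors.Join returns nil when errs is empty.
	return errors.Join(errs...)
}

func main() {
	// A policy that uses neither CORS nor JWT passes validation.
	fmt.Println(validateSpec(&SecurityPolicySpec{}) == nil)
	// A JWT section that is set but empty is reported as invalid.
	fmt.Println(validateSpec(&SecurityPolicySpec{JWT: &JWT{}}) != nil)
}
```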

Which issue(s) this PR fixes:

Fixes #

Release Notes: No

Signed-off-by: Kensei Nakada <handbomusic@gmail.com>
@sanposhiho sanposhiho marked this pull request as ready for review December 31, 2024 09:30
@sanposhiho sanposhiho requested a review from a team as a code owner December 31, 2024 09:30

codecov bot commented Dec 31, 2024

Codecov Report

Attention: Patch coverage is 23.80952% with 16 lines in your changes missing coverage. Please review.

Project coverage is 66.72%. Comparing base (24a50b4) to head (faa9e21).
Report is 31 commits behind head on main.

Files with missing lines                              Patch %   Lines
api/v1alpha1/validation/securitypolicy_validate.go    33.33%    8 Missing ⚠️
internal/gatewayapi/securitypolicy.go                 11.11%    7 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##             main    #4987    +/-   ##
========================================
  Coverage   66.71%   66.72%            
========================================
  Files         209      209            
  Lines       32052    32379   +327     
========================================
+ Hits        21384    21605   +221     
- Misses       9388     9468    +80     
- Partials     1280     1306    +26     

Signed-off-by: Kensei Nakada <handbomusic@gmail.com>
@sanposhiho sanposhiho force-pushed the security-policy-validation branch from 332b28e to 0e44d82 Compare January 5, 2025 04:36
Signed-off-by: Kensei Nakada <handbomusic@gmail.com>
Signed-off-by: Kensei Nakada <handbomusic@gmail.com>
Signed-off-by: Kensei Nakada <handbomusic@gmail.com>
@sanposhiho sanposhiho force-pushed the security-policy-validation branch from 551e107 to 25eeb94 Compare January 9, 2025 03:34
arkodg
arkodg previously approved these changes Jan 9, 2025
Contributor

@arkodg arkodg left a comment

LGTM thanks !

@arkodg arkodg requested review from a team January 9, 2025 03:42
zhaohuabing
zhaohuabing previously approved these changes Jan 10, 2025
Member

@zhaohuabing zhaohuabing left a comment

LGTM.

IMO, the validation should be removed from the api package and moved into the gateway API translation package. This can be addressed in a follow-up PR.

@sanposhiho
Collaborator Author

sanposhiho commented Jan 10, 2025

Another question about the validation: shouldn't we do it with a validation webhook instead of during reconciliation?
I know it's a tradeoff, for example between "GW users can immediately and easily find mistakes when applying" and "the admin has to maintain the webhook properly (e.g., autoscaling)".

Signed-off-by: Kensei Nakada <handbomusic@gmail.com>
Signed-off-by: Kensei Nakada <handbomusic@gmail.com>
@sanposhiho sanposhiho dismissed stale reviews from zhaohuabing and arkodg via d5901e5 January 10, 2025 03:45
@sanposhiho sanposhiho force-pushed the security-policy-validation branch from 1cbda49 to d5901e5 Compare January 10, 2025 03:45
@@ -53,10 +64,6 @@ func validateSecurityPolicySpec(spec *egv1a1.SecurityPolicySpec) error {
		return utilerrors.NewAggregate(errs)
	}

	if err := ValidateJWTProvider(spec.JWT.Providers); err != nil {
Collaborator Author

Yet another bug: It panics if spec.JWT is nil.
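
To illustrate the panic, here is a small sketch that assumes hypothetical stand-in types (`Spec`, `JWT`) and helper names; it is not the project's code. It shows that dereferencing an unset `JWT` pointer panics at runtime, and that a nil check before the access avoids it:

```go
package main

import "fmt"

// Hypothetical stand-ins for the real API types.
type JWT struct{ Providers []string }
type Spec struct{ JWT *JWT }

// countProvidersUnsafe dereferences spec.JWT unconditionally,
// which mirrors the access pattern in the line under review.
func countProvidersUnsafe(spec *Spec) int {
	return len(spec.JWT.Providers) // panics when spec.JWT is nil
}

// countProvidersSafe treats an absent JWT section as "nothing to validate".
func countProvidersSafe(spec *Spec) int {
	if spec.JWT == nil {
		return 0
	}
	return len(spec.JWT.Providers)
}

// panics reports whether calling f panics.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	spec := &Spec{} // JWT left unset
	fmt.Println(panics(func() { countProvidersUnsafe(spec) }))
	fmt.Println(countProvidersSafe(spec))
}
```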

@sanposhiho
Collaborator Author

@arkodg @zhaohuabing Sorry, I missed that the tests were failing.
I just fixed them; please take another look.

@sanposhiho sanposhiho force-pushed the security-policy-validation branch from 7302818 to c54e177 Compare January 10, 2025 06:09
Signed-off-by: Kensei Nakada <handbomusic@gmail.com>
@sanposhiho sanposhiho force-pushed the security-policy-validation branch from c54e177 to faa9e21 Compare January 10, 2025 06:10
@zhaohuabing
Member

Another question about the validation: shouldn't we do it with a validation webhook instead of during reconciliation? I know it's a tradeoff, for example between "GW users can immediately and easily find mistakes when applying" and "the admin has to maintain the webhook properly (e.g., autoscaling)".

I guess another reason is probably that a webhook cannot be used in the host deployment mode. @arkodg should have more context on this.

@shawnh2
Contributor

shawnh2 commented Jan 10, 2025

The PR looks good!

But I'm confused about why we have these SecurityPolicy validations here. I can understand the existence of EnvoyGateway validations and EnvoyProxy validations (which can be removed once the CEL expression is stable).

@sanposhiho
Collaborator Author

But I'm confused about why we have these SecurityPolicy validations here.

So, what would you propose instead?

@arkodg
Contributor

arkodg commented Jan 10, 2025

Outlining the steps EG performs: it receives input (1) and translates it (2).
a. We rely on OpenAPI validations to validate the input in 1. These are generated using kubebuilder tags and CEL validations (which can represent complex validations). Most implementations have moved their webhook validation logic to CEL to simplify operations (out of process vs. inline), so the config still gets rejected synchronously during apply.
b. For cases where we cannot define CEL, we use the resource Status to flag any validation errors we catch during 2, e.g. the targetRef of a policy specifying a nonexistent resource, which cannot be computed by CEL and requires more info.
c. EG's philosophy is that we rely on these OpenAPI validations baked into the CRD, which apply in 1, and we don't need to rewrite these validations in 2. So the case of spec being nil is N/A in Kubernetes, because the Kube API server will reject such a config and it should never reach 2.
d. The issue does exist, however, for the File Provider case: when we read config from a file in 1, the OpenAPI validations do not kick in. This is being tracked in #4858 and needs some more exploration.
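
As a rough illustration of point a, the sketch below shows how kubebuilder markers and a CEL rule can be attached to a Go API type. `ExampleJWTProvider`, its fields, and the rule itself are hypothetical and not part of the Envoy Gateway API. The markers are plain comments to the Go compiler; controller-gen reads them and emits the corresponding OpenAPI schema constraints (including `x-kubernetes-validations` for the CEL rule) into the generated CRD, which the API server then enforces at apply time:

```go
package main

import "fmt"

// ExampleJWTProvider is a hypothetical CRD type used only for illustration;
// it is not the Envoy Gateway API. The marker comments below are read by
// controller-gen, not by the Go compiler.
//
// +kubebuilder:validation:XValidation:rule="!has(self.jwksURI) || self.jwksURI.startsWith('https://')",message="jwksURI must use https"
type ExampleJWTProvider struct {
	// Name must be non-empty; enforced by the API server via the OpenAPI schema.
	//
	// +kubebuilder:validation:MinLength=1
	Name string `json:"name"`

	// JWKSURI is optional; the CEL rule on the type constrains it when set.
	//
	// +optional
	JWKSURI *string `json:"jwksURI,omitempty"`
}

func main() {
	uri := "https://example.com/jwks.json"
	p := ExampleJWTProvider{Name: "example", JWKSURI: &uri}
	fmt.Println(p.Name, *p.JWKSURI)
}
```

Because these constraints live in the CRD schema, they apply only to objects that pass through the Kubernetes API server, which is why point d calls out the File Provider case separately.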

@arkodg arkodg merged commit 8dbe6e0 into envoyproxy:main Jan 10, 2025
24 of 25 checks passed