-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
admission.Webhook always returns HTTP 200, making failurePolicy=Ignore moot #2662
Comments
/kind support |
This doesn't seem to be explicit to controller-runtime but webhooks in general. Webhooks always return 200. Are you expecting a different response or the failure policy is not adhering to the webhook config? |
This issue highlights the gardener/gardener#7013 issue you may be facing around a discussion where this took place. |
Yes, that's highlighting the same issue:
What I'd like is a way to signal that the webhook is not working, e.g. by returning a 500. Then it's left up to the failurePolicy of the webhook to determine if the request is allowed or not. Here's a concrete, albeit contrived example: let's say I'm introducing a webhook that reaches out to another service as part of its logic. The webhook pods require a network policy to allow ingress from the apiserver, and another one to allow egress to the other service. And let's say I get the port wrong on the ingress policy so the pods are unreachable. If the webhook is configured with failurePolicy=Ignore, the request is allowed. But if I messed up the port on the egress policy instead, the request to the external service would fail, and the webhook would attempt to return a 500. To me these are the same class of misconfiguration that failurePolicy=Ignore should handle, but in the latter case the request is ultimately rejected by the apiserver. |
That’s not what FailurePolicy does. This request is an upstream issue and not indicative of how controller runtime webhooks work. In your example, your accept/reject criteria is defined in the logic that you use to implement the interface. You can write a custom validator and utilize logic from the external service to use errors you defined from the response type and reject/accept from there. If you need to ignore those errors you will not get that from the failure policy. If you wish to submit an issue around this upstream, that API would need to be changed to accommodate your request. The 200 response is showing that webhook indeed succeeded in handling the request as described in the issue linked, discussion in slack and api design of admission review requests. |
Why is there a distinction between a failure which occurs before the request gets to the handler, and a failure in the handler itself? That's what I was trying to illustrate in my example. In either situation it is a logical failure of the webhook to generate an admission decision. Why do you feel it is an upstream issue? If controller runtime provided a hook to return a HTTP 500 when the webhook is failing and unable to make a decision, then it would follow the same apiserver code path as a timeout/failure to reach the webhook, and the ultimate decision would be based on the webhook's failure policy. |
Returning an error rejects the request. If you wanted the request to not get rejected, you would return nil. If you want to do some error logic around those parameters within your webhook, you can do
That's where failurePolicy comes in.
Handling the request and what
This is an upstream issue because webhooks that function with core are inherited within the scope to behave just as upstream intends to and abide by the same dependency we carry implementing within controller-runtime. Implementing new logic to utilize the field failurePolicy to do what you are asking is in fact changing the behavior of what we understand the field to do with the unreachable webhook. There is a path forward in your case. Controller-runtime provides the interface you can implement to reject/accept requests and how that happens is up to you and not derived from the http status code from the "error" that is handed back where failurePolicy wouldn't do anything with that. If your issue is with failurePolicy and not the implementation of the webhook strategy provided by controller-runtime, this is an issue that upstream would have to resolve as that behavior is what controller-runtime provides. |
You can also take advantage of admission.Warnings to try out webhook rejections when request fail. |
Just to make sure we're on the same page, I'm trying to achieve (2):
An example patch might look something like: diff --git a/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/admission/http.go b/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/admission/http.go
index 57e465ab..38df9c29 100644
--- a/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/admission/http.go
+++ b/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/admission/http.go
@@ -96,6 +96,12 @@ func (wh *Webhook) ServeHTTP(w http.ResponseWriter, r *http.Request) {
wh.getLogger(&req).V(5).Info("received request")
reviewResponse = wh.Handle(ctx, req)
+ if reviewResponse.internalServerError != nil {
+ wh.getLogger(nil).Error(err, "internal server error handling request")
+ http.Error(w, reviewResponse.internalServerError.Error(), http.StatusInternalServerError)
+ return
+ }
+
wh.writeResponseTyped(w, reviewResponse, actualAdmRevGVK)
}
diff --git a/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/admission/response.go b/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/admission/response.go
index ec1c88c9..d3758550 100644
--- a/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/admission/response.go
+++ b/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/admission/response.go
@@ -59,6 +59,16 @@ func Errored(code int32, err error) Response {
}
}
+// InternalServerError constructs a response indicating that the an internal
+// given error occurred while handling the request, and an admission decision
+// could not be generated. The API server will treat it as a failure to call
+// the webhook.
+func InternalServerError(err error) Response {
+ return Response{
+ internalServerError: err,
+ }
+}
+
// ValidationResponse returns a response for admitting a request.
func ValidationResponse(allowed bool, message string) Response {
code := http.StatusForbidden
diff --git a/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/admission/webhook.go b/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/admission/webhook.go
index f1767f31..3d2a7d4d 100644
--- a/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/admission/webhook.go
+++ b/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/admission/webhook.go
@@ -62,6 +62,11 @@ type Response struct {
// AdmissionResponse is the raw admission response.
// The Patch field in it will be overwritten by the listed patches.
admissionv1.AdmissionResponse
+
+ // internalServerError indicates an internal error occurred while processing
+ // request, and an admission decision could not be generated. If set, a HTTP
+ // 500 will be returned instead of an admission decision.
+ internalServerError error
}
// Complete populates any fields that are yet to be set in
Then in a situation where my handler is unable to process the request it would return: allow, err := doWebhookLogic(ctx, req)
if err != nil {
return admission.InternalServerError(err)
}
if !allow {
if shadowDenyMode {
return admission.Allowed("").WithWarnings("some warning about being blocked in the future")
}
return admission.Denied("denied")
}
return admission.Allowed("") The handler itself isn't trying to consider failurePolicy or anything like that, it's just trying to communicate that it's failing (for whatever reason) and cannot make an admission decision. |
With your example, this now returns a non 200 ok response. There is no other non 200 response in this implementation of the webhook. That is because any non 200 response is treated as a failure to serve the request and the apiserver will see that request as failed and allow the failurepolicy to take place. That ignore or failed will act on this with the intention of it not being reachable and the rejection will abide by that. I contend that this strategy obfuscates the intention behind the failure policy and conflates the non 200 ok failure request as a unreachable request when it was served and left to abide by a failure policy to handle the decision of the logic of the unreachable request. @alvaroaleman @sbueringer @vincepri what do you think? |
The intention behind the failurePolicy is to deal with webhook unavailability or not working for other reasons. If you want a new setting to ignore the decision of the webhook, that is an upream request and unrelated to controller-runtime as troy pointed out in #2662 (comment). The http/200 indicates only if the webhook was able to successfully process the request and doesn't tell you anything about its decision. So yes, if you want to both enable the webhook but it not actually block anything, you will have to write code similiar to what you suggested in #2662 (comment) |
I don't want a new setting to ignore the decision of the webhook, I want a way for the webhook to express it is failing and that it can't make a decision by returning a 500 error. To me that aligns with "webhook unavailability or not working for other reasons." But I understand you guys feel there should be a distinction between failures to get to the webhook and failures that occur in the webhook itself, and that failurePolicy should only apply to the former. If that's settled, then there's nothing to do and we can close this out. The escape hatch we are using is a forked copy of admission.Webhook.ServeHTTP, since it's not much code. |
There is controller-runtime/pkg/webhook/admission/response.go Lines 49 to 60 in f6052ab
|
But that doesn't trigger the apiserver's ignoreClientCallFailures := hook.FailurePolicy != nil && *hook.FailurePolicy == v1.Ignore
rejected := false
if err != nil {
switch err := err.(type) {
case *webhookutil.ErrCallingWebhook:
if !ignoreClientCallFailures {
rejected = true
admissionmetrics.Metrics.ObserveWebhookRejection(ctx, hook.Name, "validating", string(versionedAttr.Attributes.GetOperation()), admissionmetrics.WebhookRejectionCallingWebhookError, int(err.Status.ErrStatus.Code))
}
admissionmetrics.Metrics.ObserveWebhook(ctx, hook.Name, time.Since(t), rejected, versionedAttr.Attributes, "validating", int(err.Status.ErrStatus.Code))
case *webhookutil.ErrWebhookRejection:
rejected = true
admissionmetrics.Metrics.ObserveWebhookRejection(ctx, hook.Name, "validating", string(versionedAttr.Attributes.GetOperation()), admissionmetrics.WebhookRejectionNoError, int(err.Status.ErrStatus.Code))
admissionmetrics.Metrics.ObserveWebhook(ctx, hook.Name, time.Since(t), rejected, versionedAttr.Attributes, "validating", int(err.Status.ErrStatus.Code))
default:
rejected = true
admissionmetrics.Metrics.ObserveWebhookRejection(ctx, hook.Name, "validating", string(versionedAttr.Attributes.GetOperation()), admissionmetrics.WebhookRejectionAPIServerInternalError, 0)
admissionmetrics.Metrics.ObserveWebhook(ctx, hook.Name, time.Since(t), rejected, versionedAttr.Attributes, "validating", 0)
}
} else {
admissionmetrics.Metrics.ObserveWebhook(ctx, hook.Name, time.Since(t), rejected, versionedAttr.Attributes, "validating", 200)
return
} |
Yes, because it isn't. The response code from the webhook indicates if it was able to construct a valid |
If a handler encounters an unrecoverable error and returns:
The ultimate response from the webhook server is a HTTP 200 with the 500 status code to return from the apiserver packaged in the response body.
The problem with this approach is that 500 error the handler attempts to return is not ignored by failurePolicy=Ignore, since the apiserver sees it as a valid rejection response.
When initially rolling out new webhooks we want them to be functional no-ops, in a shadow deny/log-only mode at first. And we start with failurePolicy=Ignore with the expectation that any server 500s will be ignored, which should ensure the initial rollout has no user-facing impact. But since controller-runtime's admission.Webhook implementation always returns a HTTP 200, the apiserver's ignore case is never triggered. Is there a recommended way for handlers to signal they want to return a non-successful HTTP code?
The text was updated successfully, but these errors were encountered: