503 and 403 errors when using more than 1 ambassador pod #1461
Comments
I am still getting intermittent 503s even with just one replica.
This is happening for 10-20% of the requests.
Facing a similar issue on the latest version of Ambassador (0.60.2).
Also seeing this issue.
We were encountering this issue as well and are seeing intermittent errors. Is the retry configuration applied to the external auth call? If not, is there a way to configure that?
Can anybody confirm that retry_policy works for AuthService? The changes were merged, but they actually pointed to an issue in the Envoy repo, envoyproxy/envoy#5974, which was closed without adding retry support for the envoy.ext_authz filter.
I believe we have already taken this patch in our version of Envoy.
@richarddli If I understand correctly, this is the configuration definition for the ext_authz filter: https://github.com/datawire/ambassador/blob/master/go/apis/envoy/config/filter/http/ext_authz/v2/ext_authz.pb.go#L72
Hi @richarddli, did you have a chance to recheck whether the retry policy works for the AuthService configuration in the latest releases of Ambassador? If we define the retry per Mapping or globally it works fine, but when it is defined only for the AuthService it doesn't seem to take effect.
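For reference, this is roughly what the per-Mapping and global retry configuration being discussed looks like. A minimal sketch assuming the getambassador.io/v2 CRD form (older releases use annotations); names and values are placeholders:

```yaml
---
# Per-Mapping retry (the case reported above as working)
apiVersion: getambassador.io/v2
kind: Mapping
metadata:
  name: backend-mapping            # placeholder
spec:
  prefix: /backend/
  service: backend-service         # placeholder upstream
  retry_policy:
    retry_on: "5xx"
    num_retries: 3
---
# Global retry via the ambassador Module (also reported as working)
apiVersion: getambassador.io/v2
kind: Module
metadata:
  name: ambassador
spec:
  config:
    retry_policy:
      retry_on: "retriable-4xx"
      num_retries: 2
```

Defining the same retry_policy block on the AuthService alone is what does not appear to take effect, presumably because of the Envoy ext_authz limitation mentioned above.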
Hi, any closure on this? We're experiencing the same behavior with just a single replica of plain Envoy with an authorization service and an endpoint backend. Around 20% of the requests return 403 or 503 when we generate higher load. Note that the failed requests never reach the target component at all (403: the authorization service, 503: the backend).
We are seeing this issue with Ambassador
@richarddli can you please comment on this?
For us it was a wrong Kubernetes configuration: our pods didn't have enough connections enabled in sysctl.
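The comment above doesn't say which sysctls were changed, so purely as an illustration: connection-related kernel limits can be raised through the pod's securityContext.sysctls. The sysctl names and values below are assumptions, and net.core.somaxconn is an "unsafe" sysctl that the kubelet must explicitly allow:

```yaml
# In the ambassador Deployment's pod template (spec.template.spec):
securityContext:
  sysctls:
    # Illustrative names/values only; not the ones from the comment above.
    - name: net.ipv4.ip_local_port_range   # more ephemeral ports for upstream connections
      value: "1024 65000"
    - name: net.core.somaxconn             # larger listen backlog; requires --allowed-unsafe-sysctls on the kubelet
      value: "4096"
```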
We are using ambassador:0.75.0 with 3 replicas and are getting a similar issue: intermittent failures while hitting the authorisation service (`HTTP/1.1" 403 UAEX 0 0 5002 -` in the access log), with connections getting closed after ~5 sec.
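The ~5 sec cutoff with UAEX looks consistent with the auth call being cut off by the AuthService timeout_ms (which I believe defaults to 5000 ms) rather than by the auth backend itself. If the authorisation service is simply slow under load, raising that timeout is one thing to try; a sketch only, with a placeholder address and value:

```yaml
apiVersion: getambassador.io/v2
kind: AuthService
metadata:
  name: authentication
spec:
  auth_service: auth-service.default:3000   # placeholder address
  proto: http
  timeout_ms: 10000                          # raised from the (assumed) 5000 ms default
  allowed_request_headers:
    - "authorization"
```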
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
We're experiencing this as well.
I'm experiencing this as well, running in AWS, when the concurrency is high enough. Firing 200 requests (5 concurrent) usually makes at least one fail with response flag UAEX or UC. When UAEX is raised we can't find any traces in our AuthService. When UC is raised we can find a trace, but the trace says everything is okay and 200 is returned. To me it seems like UAEX is raised when the connection is closed before the AuthService is reached, while UC is raised when the connection is closed before the AuthService has responded. We're running version 1.0.
Same issue here: 403s or 503s with UC and UAEX response flags in the logs.
The latest version is 1.1.1. Maybe you should try to upgrade first?
I haven't had the time to try it yet, but I'm hopeful that the new cluster_idle_timeout_ms setting (https://www.getambassador.io/reference/core/ambassador/#upstream-idle-timeout-cluster_idle_timeout_ms) in version 1.1.1 might solve the issue for me.
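For reference, that setting goes on the ambassador Module; a minimal sketch (the 30 s value is only an example, not a recommendation):

```yaml
apiVersion: getambassador.io/v2
kind: Module
metadata:
  name: ambassador
spec:
  config:
    # Close idle upstream connections after 30 s instead of keeping them
    # open indefinitely.
    cluster_idle_timeout_ms: 30000
```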
Update: I have managed to rectify this issue in my cluster (still on the same version). In my case, I found that the bottleneck was on my external auth service.
Did anyone manage to solve this issue? Or did you apply any workarounds to mitigate the number of errors between Ambassador and the AuthService? @sekaninat Do you remember what you changed and what values you had before?
@richarddli Shouldn't this issue be reopened? I'm unfortunately also seeing this issue :(
Believe it or not, in our case some of the errors went away by increasing the CPU limits for the ambassador deployment, but the problem still remains.
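For anyone trying the same mitigation: the change amounts to giving the ambassador container more CPU headroom in its Deployment. The numbers below are placeholders, not the values used above:

```yaml
# In the ambassador container of the Deployment (spec.template.spec.containers[]):
resources:
  requests:
    cpu: "1"          # placeholder; size to your observed load
    memory: 400Mi
  limits:
    cpu: "2"          # the CPU limit is what was raised in the comment above
    memory: 800Mi
```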
Did anyone find an official solution for this issue?
I am also getting this issue. Any solutions or workarounds yet?
Did we get any resolution on this? We are also getting this error with emissary-ingress 2.1, and we tried with 2 pods.
In our case, it was an issue with our design. We had two different instances of the application running, each with its own authentication service. When a request hit the ingress, it was sent for authentication in a round-robin fashion. When a request for a given application hit the corresponding auth service it worked, but when it hit the other auth service it failed. We changed our design to use a single authentication service and it is working fine now. Hope this helps if someone encounters a similar problem.
I think this should not be a 403 but a 5xx instead. Irrespective of that, in our case it was clearly a CPU-related bottleneck on the ambassador pods. Adding extra pods and balancing our workload eliminated the 403s.
Describe the bug
When talking to a service through Ambassador that has an auth service, I was getting what appeared to be random responses of 200, 503, and 403. The ambassador deployment had replicas set to 3. Looking at the logs, one pod was always giving 200 responses, another 503, and another 403. I tried restarting the troublesome pods with no luck.
As a workaround I scaled my deployment down to just 1 replica and now only seem to be getting 200 responses. I only saw this bug after upgrading to 0.53.1; I was previously on version 0.50.3.
Versions (please complete the following information):