Restrict status codes raised by the library #595
Comments
Connect returns I'm very reluctantly okay with changing our use of |
Clarifying the server error codes should help users debug errors. Any |
Mangled payloads can just as easily come from the client's RPC runtime. I've never seen a REST server treat a mangled JSON request payload as a 5xx - it seems pretty standard to me to assume that mangled payloads are the client's fault (or some middlebox's fault). |
Anything but a 400 is unexpected in my book. The gRPC protocol does not use HTTP status codes and is rarely used with JSON. Are we sure we want to forego semantic HTTP status codes for the Connect implementation because of gRPC? It would be really nice if there was a set of error codes reserved for users. It would help with debugging and it would let a user-facing application decide which error messages to show to the user verbatim. I don't think that restricting codes raised by the library gets us there though: It leaves just 7 codes that are guaranteed to originate from the server application, and it doesn't give any guarantees about the inverse (since a server application can raise any code). So I wonder if what gRPC does is actually very helpful in practice. |
Ideally it would be a 400 HTTP status but not If both client and server generate the request, a 500 seems reasonable to me, as this is an issue with the client/server lib, but yep, this isn't RESTful. |
@jhump and @mattrobenolt Curious about your thoughts on this one. |
To be explicit, if this is referring specifically to the codes we spit back for the gRPC protocol, and not the Connect protocol, I think we should mirror the behavior of grpc-go and what the spec says. I haven't gone through and audited each use of them and compared it to what the spec says, but I think it'd make sense to try to be as compatible with expectations as possible. Granted, the only real downside I see is that this might be considered an API-breaking change? But I don't feel that strongly about it. As a quick run through our/my uses, we don't do any checks that particularly compare the status codes. Anything not-OK is usually treated the same, and typically it's either pass/fail, with the code and status as just effectively metadata. |
I agree with Timo, that anything other than a 400 is bad for the Connect protocol. But, TBH, we should perhaps make an appeal to the gRPC team to change the spec, because using "internal" for gRPC error codes is also bad for the same reason: it makes observability and alerting a nightmare because you can no longer correctly distinguish between "errors caused by a bad client" and "errors caused by my server". I've never worked at a place where dev ops didn't distinguish these cases for alerting and exception-reporting config. So having a client-induced error reported as "internal" is bad. With HTTP/REST, you'd generally distinguish by classifying status codes as 4xx vs 5xx. With gRPC, the best you can do is to statically classify each gRPC error code as either "client" or "server", and this breaks down when a code like "internal" can mean either. |
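To make the alerting problem above concrete, here is a rough sketch of a static per-code classification. Everything in it is an assumption for illustration: the `Code` and `Blame` types, the `Classify` function, and the particular client/server assignment in the table are hypothetical and not taken from any gRPC or Connect library. A real setup would key on the code type of its RPC library and choose its own assignment.

```go
package main

import "fmt"

// Code is a hypothetical stand-in for a status code name.
type Code string

// Blame says which side an alerting rule would hold responsible.
type Blame string

const (
	BlameClient    Blame = "client"
	BlameServer    Blame = "server"
	BlameAmbiguous Blame = "ambiguous"
)

// blameByCode is one possible static assignment, chosen for illustration.
// "internal" is deliberately absent: if the framework also raises it for
// mangled client payloads, no static entry for it can be correct.
var blameByCode = map[Code]Blame{
	"invalid_argument":    BlameClient,
	"failed_precondition": BlameClient,
	"not_found":           BlameClient,
	"permission_denied":   BlameClient,
	"unauthenticated":     BlameClient,
	"unimplemented":       BlameServer,
	"unavailable":         BlameServer,
	"data_loss":           BlameServer,
}

// Classify returns the side to blame for a code, or BlameAmbiguous when the
// code has no entry in the static table.
func Classify(c Code) Blame {
	if b, ok := blameByCode[c]; ok {
		return b
	}
	return BlameAmbiguous
}

func main() {
	for _, c := range []Code{"invalid_argument", "unavailable", "internal"} {
		fmt.Printf("%s -> %s\n", c, Classify(c))
	}
}
```

The table works only as long as every code has a single origin. Once a code can come from either side, it has to fall into the ambiguous bucket, which is exactly the case an alerting rule cannot act on.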
Interestingly gRPC clients that receive a 400 response from a service map to OTEL recently adjusted the gRPC status codes to map closer to HTTP semantics: open-telemetry/opentelemetry-specification#3333
This is fixed with the conformance tests and the spec update for error codes. |
Connect-go raises error codes that fall outside what the gRPC spec recommends for a framework: https://github.com/grpc/grpc/blob/master/doc/statuscodes.md
INVALID_ARGUMENT and FAILED_PRECONDITION, from a quick grep. I think most of these cases should be changed to Internal errors.