-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Alternate AIP-193 recommendation #45
base: main
Are you sure you want to change the base?
Conversation
This adds a generic AIP for errors. There is probably a decent bit for us to discuss here. Some high-level notes: - I decided to represent expected JSON interfaces using TypeScript (rather than JSONSchema), which I perceive to be much better for human readability. - We should discuss/debate the proposed Error interface. It is mostly similar to what Google uses but with two fields "promoted" (we have a huge _mea culpa_ here). - I did not discuss any common _headers_ related to error handling (e.g. `Retry-After`). I personally think that `Retry-After` gets covered in AIP-194, and I could not think of any others that warranted inclusion here. I expect this is an area where everyone will need to make changes, but I also notice that entire sections can probably be adopted by everyone (e.g. "Messages"), so I think this should work reasonably well. Looking forward to the discussion on this.
aip/general/0193/aip.md
Outdated
```typescript | ||
interface Error { | ||
//A machine-readable code indicating the type of error. This value is parseable for programmatic error handling. | ||
code?: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code?: string
instead of type?: string, format: uri-reference
- There's a need to return a machine-readable error code beyond HTTP status. While technically the type parameter returns a unique URI reference for a given problem, comparing URI references programmatically doesn't seem an ideal experience.
- Now any string would qualify as a uri-reference, it’s confusing to recommend something as a uri-reference, if we think most API providers should just return error codes.
- To ensure we’re compliant with RFC 7807, an alternative name that’s used frequently by several APIs could be “code” instead of “type”.
- Example usage: Apple, Stripe, Github, Microsoft, Facebook
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternate option here could use type?: string
. Though, I'm not sure if that breaks compliance with RFC 7807
aip/general/0193/aip.md
Outdated
detail?: string | ||
|
||
//A unique identifier that identifies the specific occurrence of the problem. Can be provided to the API owner for debugging purposes. | ||
id?: string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
id?: string
instead of instance?: string, format: uri-reference
- More on “instance”:
- When the "instance" URI is dereferenceable, the problem details object can be fetched from it. It might also return information about the problem occurrence in other formats through use of proactive content negotiation (see [HTTP], Section 12.5.1).
- When the "instance" URI is not dereferencable, it serves as a unique identifier for the problem occurrence that may be of significance to the server, but is opaque to the client.
- Practical use cases of returning additional problem details for a specific occurrence through a URL seem limited.
- To represent a unique identifier for an error, while being compliant with RFC 7807, an alternative name could be “id” (Example usage: Apple, Facebook)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternate options:
errorInstanceId
traceId
requestId
occurrenceId
Consensus was to use occurrenceId
. id
, errorId
could be confused as an id unique for this error type
instead of the specific occurrence. `
aip/general/0193/aip.md
Outdated
|
||
Services **must** clearly distinguish successful responses from error responses | ||
by using appropriate HTTP codes: | ||
- Informational responses **must** use HTTP status codes between 100 and 199. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allowing 100 to 199 status code is per RFC 7807
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure what an "informational response" is - and I'd rather not have to go to the HTTP RFCs to find out. Does this range indicate success or failure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These may not indicate success or failure. These are often used to indicate intermediate status of an HTTP request. For instance, a server that supports HTTP/2 can respond with 101 Switching Protocols
upgrading a HTTP/1.1 connection to a HTTP/2 connection: https://datatracker.ietf.org/doc/html/rfc7540#section-3.2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding clarification
aip/general/0193/aip.md
Outdated
```typescript | ||
interface Error { | ||
//A machine-readable code indicating the type of error (like `name_too_long`). This value is parseable for programmatic error handling. | ||
code: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code: string
instead of type?: string, format: uri-reference
- There's a need to return a machine-readable error code beyond HTTP status. While technically the type parameter returns a unique URI reference for a given problem, comparing URI references programmatically doesn't seem an ideal experience.
- Now any string would qualify as a uri-reference, it’s confusing to recommend something as a uri-reference, if we think most API providers should just return error codes.
- To ensure we’re compliant with RFC 7807, an alternative name that’s used frequently by several APIs could be “code” instead of “type”.
- This one field could be set as required.
- Example usage: Apple, Stripe, Github, Microsoft, Facebook
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably specify that strings should be comparable using ordinal comparisons - not case-insensitively, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding clarification below. One thing to note is the recommendation is to use only lowercase letters here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per discussion, consensus was to to stick with type?: string
without format: uri-reference
because comparing URI references programmatically isn't an ideal experience. @dret we were wondering if there has been a discussion around dropping format: uri-reference
from RFC 7807?
necessary, the service **should** provide a link where a reader can get more | ||
information or ask questions to help resolve the issue. | ||
|
||
Below are some examples of good errors and not so good errors: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts on including some example good and not so good errors here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing Google AIP-193 talks about ErrorInfo, which includes a string-to-string map for additional information, like the extensions in RFC 7807. I think we should have some way of passing additional error-specific-but-machine-readable information.
aip/general/0193/aip.md
Outdated
|
||
Services **must** clearly distinguish successful responses from error responses | ||
by using appropriate HTTP codes: | ||
- Informational responses **must** use HTTP status codes between 100 and 199. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure what an "informational response" is - and I'd rather not have to go to the HTTP RFCs to find out. Does this range indicate success or failure?
aip/general/0193/aip.md
Outdated
Services **must** clearly distinguish successful responses from error responses | ||
by using appropriate HTTP codes: | ||
- Informational responses **must** use HTTP status codes between 100 and 199. | ||
- Successful responses **must** use HTTP status codes between 200 and 399. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are 3xx responses really successful in the context of APIs? 304 could be considered successful, but anything else really suggests further action.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. 3XX should mean redirection. Updating this
aip/general/0193/aip.md
Outdated
```typescript | ||
interface Error { | ||
//A machine-readable code indicating the type of error (like `name_too_long`). This value is parseable for programmatic error handling. | ||
code: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably specify that strings should be comparable using ordinal comparisons - not case-insensitively, for example.
aip/general/0193/aip.md
Outdated
|
||
```typescript | ||
interface Error { | ||
//A machine-readable code indicating the type of error (like `name_too_long`). This value is parseable for programmatic error handling. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super-nit: //
without a space after it makes me wince :)
On 2021-12-07 16:20, Jon Skeet wrote:
***@***.**** commented on this pull request.
The existing Google AIP-193 talks about ErrorInfo, which includes a
string-to-string map for additional information, like the extensions in
RFC 7807. I think we should have /some/ way of passing additional
error-specific-but-machine-readable information.
that's a good idea. it folds nicely into my proposal to be a more
structured with the AIPs by always having why/what/how sections. the
"there should be /some/ way of passing additional
error-specific-but-machine-readable information" is perfect for the
"what" part, and ErrorInfo and RFC 7807 (maybe plus patterns for how to
use the type identifier) are two very useful "how" parts.
…--
erik wilde | ***@***.*** |
| http://dret.net/netdret |
| http://twitter.com/dret |
|
Another place to look that might add some insight here, https://jsonapi.org/format/#errors |
aip/general/0193/aip.md
Outdated
// A unique identifier that identifies the specific occurrence of the problem. Can be provided to the API owner for debugging purposes. | ||
id?: string | ||
|
||
// An array of additional error details. errors?: any[] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- a map is recommended
- any json value would be recommended
metadata?: object
aip/general/0193/aip.md
Outdated
|
||
Services **must** clearly distinguish successful responses from error responses | ||
by using appropriate HTTP codes: | ||
- Informational responses **must** use HTTP status codes between 100 and 199. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These may not indicate success or failure. These are often used to indicate intermediate status of an HTTP request. For instance, a server that supports HTTP/2 can respond with 101 Switching Protocols
upgrading a HTTP/1.1 connection to a HTTP/2 connection: https://datatracker.ietf.org/doc/html/rfc7540#section-3.2
@@ -0,0 +1,7 @@ | |||
--- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to change anything in this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like google.aip.dev moves this front-matter to the file itself (which I tend to agree with)
https://raw.githubusercontent.com/aip-dev/google.aip.dev/master/aip/general/0001.md.
Is there an outstanding issue to move this to the more succinct format?
aip/general/0193/aip.md
Outdated
only lower-case letters, numbers, and the `-` character. These strings should | ||
be comparable using ordinal comparisons. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this mean? What kind of "ordinal comparisons", and for what purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added "ordinal comparison" here per @jskeet's feedback: #45 (comment)
What that would mean here is each a developer should send stable error codes that can always be compared to the exact same numeric character values. For example, an API shouldn't send invalid_auth
sometimes and Invalid_Auth
other times.
That said, given we're recommending developers to always send lower-case letters here. I'm curious if we need that anymore? @jskeet thoughts?
```typescript | ||
interface Error { | ||
// A machine-readable code indicating the type of error (like `name_too_long`). This value is parseable for programmatic error handling. | ||
type: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-posting this comment here because the last one is now buried under the PR updates.
Per discussion, the consensus was to stick with type?: string
without format: uri-reference
because comparing URI references programmatically isn't an ideal experience. Also, non-resolvable URI references are not valuable. @dret we were wondering what do you think about dropping format: uri-reference
from RFC 7807?
Co-authored-by: Richard Gibson <richard.gibson@gmail.com>
occurenceId?: string | ||
|
||
// A map of metadata returning additional error details that can be used programmatically | ||
metadata?: dict<string, any> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The schema of metadata should be documented and a change in this schema could mean a breaking change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shwoodard: The schema of metadata should be fixed per type
and a change in this schema could mean a breaking change.
aip/general/0193/aip.md
Outdated
detail?: string | ||
|
||
// A unique identifier that identifies the specific occurrence of the problem. Can be provided to the API owner for debugging purposes. | ||
occurenceId?: string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the feedback in hangout chat was occurenceID
isn't elegant. Do folks have other suggestions here? We previously rejected names like id
because we thought they may seem to have 1 to 1 mapping with type
.
We considered traceId
but were unsure if that may have a different meaning than what trace_id
may mean with open telemetry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shwoodard incidentID
as alternative
@lukesneeringer incidentId
sounds 👍
occurenceId?: string | ||
|
||
// A map of metadata returning additional error details that can be used programmatically | ||
metadata?: dict<string, any> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shwoodard: The schema of metadata should be fixed per type
and a change in this schema could mean a breaking change.
aip/general/0193/aip.md
Outdated
status?: integer | ||
|
||
// A human-readable explanation specific to this occurrence of the problem | ||
detail?: string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shwoodard @lukesneeringer message
is more standard name for this field. Though RFC 7807 has this as detail
.
@lukesneeringer @shwoodard we should drop detail
from the recommendation because adding explanation specific to an occurrence of the problem may lead to developers parsing this string.
aip/general/0193/aip.md
Outdated
detail?: string | ||
|
||
// A unique identifier that identifies the specific occurrence of the problem. Can be provided to the API owner for debugging purposes. | ||
occurenceId?: string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shwoodard incidentID
as alternative
@lukesneeringer incidentId
sounds 👍
@saurabhsahni would it be possible to update the PR comment that has |
@shwoodard I updated PR to reflect the new structure. |
|
||
```typescript | ||
interface Error { | ||
// A machine-readable code indicating the type of error (like `name_too_long`). This value is parseable for programmatic error handling. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarify that this "Should not change from occurrence to occurrence"
aip/general/0193/aip.md
Outdated
// A machine-readable code indicating the type of error (like `name_too_long`). This value is parseable for programmatic error handling. | ||
type: string; | ||
|
||
// A human readable description of the problem. Should not change from occurrence to occurrence. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A human-readable description of the problem. Messages are likely to be logged in plain text and should not include information about a specific occurrence. Information about specific occurrences should be part of metadata
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello! I was requested by @makahmad to leave a review.
I think there's already a lot of outstanding issues as-is, so I'm wondering how those will be resolved.
Honestly with a year since the last activity, it's probably worth re-discussing this in the AIP meetings to get some re-review here of the high level approach.
|
||
Error responses **should** conform to the following interface: | ||
|
||
```typescript |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why typescript when the existing aip.dev is using protobuf / OpenAPI? https://aip-dev.github.io/aip.dev/136.
errors, and to reduce boilerplate by having common error-handling logic, rather | ||
than being expected to constantly add verbose error handling everywhere. | ||
|
||
## Guidance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since there already is protobuf-level guidance, why not include protobuf? or does it translate? https://google.aip.dev/193.
I'm thinking about how to reconcile this proposal with existing practices in google.aip.dev (which, as the only public repository up until now, has served as the base of other forks IIUC).
Overview
This PR presents an alternate AIP-193 Error response structure recommendation as below
Error structure
Background
Here's the difference between RFC 7807 and what we've in original AIP-193:
(between 200 and 599)
(between 100 and 599)
This is a constant value that identified the cause of the error, unique within a given domain, that developers write code against.
A URI reference RFC3986 that identifies the problem type (doesn’t need to resolve)
It should not change from occurrence to occurrence of the problem, except for purposes of localization.
usually the registered service address of the tool or product that generates the error
An array of additional error details.
A human-readable explanation specific to this occurrence of the problem
Tweaks from RFC 7807 in detail
detail
key (A human-readable explanation specific to this occurrence of the problem) in the top-level error structure. If developers only include occurrence-specific information likeYour current balance is 30, but that costs 50
as a string message, developers may turn to parse these error strings to dive into the problem.title
fieldmessage
because it’s a more standard name to describe the error like Exceptions have messages.RFC 7807 recommends
Content-Type
to be set asapplication/problem+json
. However, it's unclear to me how much value a unique content type would add for the error scenario? Should clients rather refer to HTTP status code in the header to figure out if there was an error or not? Do we need to provide recommendation forContent-Type
as part of this AIP?In lieu of allowing additional properties, we would like to add a new extensible property called
metadata
that can be used programmatically. This will take following structure:metadata?: dict<string, any>