Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better error signaling in protojson.Unmarshal() function #1412

Closed
rkintzi opened this issue Feb 9, 2022 · 9 comments
Closed

Better error signaling in protojson.Unmarshal() function #1412

rkintzi opened this issue Feb 9, 2022 · 9 comments

Comments

@rkintzi
Copy link

rkintzi commented Feb 9, 2022

Is your feature request related to a problem? Please describe.

Errors returned by protojson.Unmarshal() are undocumented and hard (if not impossible) to parse or handle.

Context: I need to wrap GRPC services/methods with REST endpoints. Part of this is translating JSON structs to protobuf messages. protojson.Unmarshal() function seems an obvious choice to have this done. Unfortunately errors returned from this function can not be easily translated to something meaningful for the client.

Example errors returned from Unmarshal() function:

  1. proto: (line 1:41): invalid value for enum type: "bad value"
  2. proto: syntax error (line 1:1): unexpected token [ - when json list is parsed into a message
  3. proto: syntax error (line 1:1): unexpected token - when empty slice of bytes is passed as input.

All above errors are returned as values of type *errors.prefixError. And because this is a private type I can not use errors.As() to handle different types of errors in different ways (although, I don't even know if protojson.Unmarshal() returns other error types; there is nothing about errors in the package documentation).

Describe the solution you'd like

  1. I suggest introducing new error types to signal different kinds of parsing errors. For example errors mentioned in example 1 above could be returned as BadValueError struct:
type BadValueError struct {
    FieldName string // name of a message field (maybe FieldPath to represent embedded messages)
    Value interface{} // a value that can not be passed into a field
}
  1. Errors like in examples 2, 3 could be represented SyntaxError:
type SyntaxError string
@dsnet
Copy link
Member

dsnet commented Feb 9, 2022

I recommend against protojson doing anything here to have more distinguishable errors as that would further lock down the JSON implementation.

Ideally, the protojson package would use the standard json package under the hood. However, it doesn't do so because the JSON flavor that protobuf uses is RFC 7493, which the standard library is not (yet) compatible with. There's movement to have the standard library be compatible with it, and when that happens, protojson should probably be switched to use that implementation. If that happens, it will break any errors that the package promises to return.

@rkintzi
Copy link
Author

rkintzi commented Feb 10, 2022

OK, fair enough. Could you confirm, that every error returned from Unmarshal indicate bad user input (so it can be forwarded to the client app as 400 bad request, and not 500)? I scanned through source code in decoder.go file (and also it is hard to imagine other kind of errors) and it seems like the case, but maybe I missed something.

Also, there is no need to fix error types to provide meaningful information. Package can provide helper functions that take generic errors and return information embedded in it. It should be possible to implement these helpers based on concrete errors returned from most parsers (they all do the same thing; and what standard json parser returns is enough for my purposes). Such functions could be marked deprecated once concrete errors are part of the interface.

But maybe it's not worth changing the current implementation, since a new one is already planned? On the other hand, it will take at least half a year, before the enhanced json package hits production Go release (not sure, but I don't see that works on your proposal has already started) .

@rkintzi
Copy link
Author

rkintzi commented Feb 10, 2022

One more thing. Should I close the issue?

@dsnet
Copy link
Member

dsnet commented Feb 10, 2022

On the other hand, it will take at least half a year, before the enhanced json package hits production Go release

Probably true. There's a new json package that's been in the works. We're waiting for Go 1.18 to be released before starting the discussion for proposing how it would be added to the standard library (whether as a separate v2 json package or to integrate it's functionality into the existing json package.)

One more thing. Should I close the issue?

That's a question for the current maintainers of Go protobuf. I no longer work on this project.

@cybrcodr
Copy link
Contributor

I'm closing this issue as there's no action needed on our end till a v2 json package is released and evaluated if it'll be a fit here which can be a separate issue.

@rkintzi
Copy link
Author

rkintzi commented Feb 14, 2022

Thanks for your support. One more thing. I notice that output from Marshal function is documented as unstable (I even notice that this is enforce in the code). The question is how it is supposed to change in the future. I hope this only applies to the JSON formatting (white spaces, indentations, etc) and not for example to how fields names and enums values are encoded. I don't want to end up with unstable REST API for my grpc services.

@dsnet
Copy link
Member

dsnet commented Feb 14, 2022

The unstable output is there so that we can change the JSON output to something that is semantically equivalent but might have superficial changes to the output and avoid breaking anyone who happens to be depending on the output being byte-for-byte identical for all time.

Now, stable output is still a useful property, but it should be a feature that we opt into, rather than on by default. See #1121.

@puellanivis
Copy link
Collaborator

I hope this only applies to the JSON formatting (white spaces, indentations, etc) and not for example to how fields names and enums values are encoded. I don't want to end up with unstable REST API for my grpc services.

As noted, the changes are only in semantically equivalent situations. The areas where you express the most concern (field names and enum encoding) are specified by the JSON-mapping for proto3, and thus have been stabilized.

@dsnet
Copy link
Member

dsnet commented Feb 14, 2022

for example to how fields names and enums values are encoded.

The field names will not changes, but it's possible that the order could change. Ordering of JSON objects have no semantic meaning according to RFC 8259 (per section 4).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants