-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds safeyaml.JSONtoYAML for safer marshaling #139
Conversation
d896997
to
6d1d761
Compare
6d1d761
to
0a4e542
Compare
safeyaml/jsontoyaml.go
Outdated
|
||
// JSONtoYAML converts json data to yaml while preserving order of object fields. | ||
// Invoke yaml.Marshal on the returned interface{}. | ||
func JSONtoYAML(jsonBytes []byte) (interface{}, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Open to feedback on this API; other options are
func JSONtoYAML(jsonBytes []byte) (yamlBytes []byte, err error) {}
func JSONtoYAML(jsonMarshaler interface{}) (yamlBytes []byte, err error) {}
func JSONtoYAML(jsonMarshaler interface{}) (yamlMarshaler interface{}, err error) {}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @bmoylan)
safeyaml/jsontoyaml.go, line 17 at r1 (raw file):
Previously, bmoylan (Brad Moylan) wrote…
Open to feedback on this API; other options are
func JSONtoYAML(jsonBytes []byte) (yamlBytes []byte, err error) {} func JSONtoYAML(jsonMarshaler interface{}) (yamlBytes []byte, err error) {} func JSONtoYAML(jsonMarshaler interface{}) (yamlMarshaler interface{}, err error) {}
My preference is with the first signature/approach -- with that one, it's explicit that JSON bytes are being converted to YAML bytes. It also has the advantage that the use of yaml.v2
becomes an internal implementation detail -- the function could later be modified/altered to use a completely different library (or an internal shaded version of yaml.v2
etc.) and would still behave as described.
It does mean that the function will be making its own decision about decoders/encoders etc., but I think that's a pretty acceptable trade-off (especially if the function is modeled as a transform of []byte
input from JSON to YAML representation)
safeyaml/jsontoyaml.go, line 23 at r1 (raw file):
} var errClosingArrayDelim = fmt.Errorf("unexpected ']' delimiter")
Since these errors are not exported, may make sense to simply construct and return them directly (see other comment)
safeyaml/jsontoyaml.go, line 26 at r1 (raw file):
var errClosingObjectDelim = fmt.Errorf("unexpected '}' delimiter") func tokenizerToYaml(dec *json.Decoder) (interface{}, error) {
Should be tokenizerToYAML
safeyaml/jsontoyaml.go, line 36 at r1 (raw file):
switch v := tok.(type) { case string: return v, nil
Can unify the common cases:
case string, bool, float64:
return v, nil
safeyaml/jsontoyaml.go, line 83 at r1 (raw file):
case ']': return nil, errClosingArrayDelim case '}':
Can unify the cases as:
case ']', '}':
return nil, fmt.Errorf("unexpected '%v' delimiter", v)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @nmiyake)
safeyaml/jsontoyaml.go, line 17 at r1 (raw file):
Previously, nmiyake (Nick Miyake) wrote…
My preference is with the first signature/approach -- with that one, it's explicit that JSON bytes are being converted to YAML bytes. It also has the advantage that the use of
yaml.v2
becomes an internal implementation detail -- the function could later be modified/altered to use a completely different library (or an internal shaded version ofyaml.v2
etc.) and would still behave as described.It does mean that the function will be making its own decision about decoders/encoders etc., but I think that's a pretty acceptable trade-off (especially if the function is modeled as a transform of
[]byte
input from JSON to YAML representation)
I added JSONtoYAMLBytes
but left the existing JSONtoYAML
so it can be used in MarshalYAML implementations.
safeyaml/jsontoyaml.go, line 23 at r1 (raw file):
Previously, nmiyake (Nick Miyake) wrote…
Since these errors are not exported, may make sense to simply construct and return them directly (see other comment)
I made the top-level variables because I check equality on them to break out of the for loops iterating over arrays and objects below
safeyaml/jsontoyaml.go, line 26 at r1 (raw file):
Previously, nmiyake (Nick Miyake) wrote…
Should be
tokenizerToYAML
Done.
safeyaml/jsontoyaml.go, line 36 at r1 (raw file):
Previously, nmiyake (Nick Miyake) wrote…
Can unify the common cases:
case string, bool, float64: return v, nil
Done.
safeyaml/jsontoyaml.go, line 83 at r1 (raw file):
Previously, nmiyake (Nick Miyake) wrote…
Can unify the cases as:
case ']', '}': return nil, fmt.Errorf("unexpected '%v' delimiter", v)
No, we need to know which type of collection we are leaving (hence the two error variables, one for each for loop above)
90778d7
to
8bc7a0d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r2.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @bmoylan)
safeyaml/jsontoyaml.go, line 17 at r1 (raw file):
Previously, bmoylan (Brad Moylan) wrote…
I added
JSONtoYAMLBytes
but left the existingJSONtoYAML
so it can be used in MarshalYAML implementations.
Would DecodeJSON
(fully qualified as safeyaml.DecodeJSON
) or JSONtoMapSlice
be more precise?
Maybe this is just my particular bias, but when I see "YAML" I generally think of YAML bytes/string, so one of the naming schemes described here would be more intuitive to me. However, don't feel super strongly about this, so also fine to proceed with current naming if you strongly prefer that.
safeyaml/jsontoyaml.go, line 83 at r1 (raw file):
Previously, bmoylan (Brad Moylan) wrote…
No, we need to know which type of collection we are leaving (hence the two error variables, one for each for loop above)
Ah I see, so there are valid cases where the error is returned but the error is just interpreted as control flow instead. Got it.
safeyaml/jsontoyaml.go, line 15 at r2 (raw file):
) // JSONtoYAML converts json data to yaml while preserving order of object fields.
Top-line summary should be updated to more precisely reflect functionality. Something like:
JSONtoYAML decodes JSON bytes into an interface where all objects and maps are decoded as a yaml.MapSlice to preserve key ordering. This can be useful if a type defines custom JSON marshal logic and translating the JSON representation to a YAML representation while preserving key ordering is sufficient for serialization. For example: ...
You don't have to use this exact wording, but do think the current wording (particularly, "converts json data to yaml") should be changed to be more precise ("decoding JSON data using yaml.MapSlice for maps and objects").
safeyaml/jsontoyaml.go, line 30 at r2 (raw file):
dec := json.NewDecoder(bytes.NewReader(jsonBytes)) dec.UseNumber() return tokenizerToYAML(dec)
After doing some local testing, it looks like there's one class of cases that isn't caught here, which is the case where the input contains values after a valid JSON block.
Simplest example is {}{
. The built-in JSON functions will error with invalid character '{' after top-level value
, while the tokenizer-based function will ignore everything after the valid JSON and thus successfully return an empty object (see https://stackoverflow.com/a/25187779 for link to source code where this is checked using scanner).
After playing around locally, I think it's sufficient to check whether dec.More
returns true after successful parsing. Would recommend something like:
val, err := tokenizerToYAML(dec)
if err != nil {
return val, err
}
if dec.More() {
errText := "invalid input after top-level value"
if buffered, err := ioutil.ReadAll(dec.Buffered()); err == nil {
errText += fmt.Sprintf(": %q", string(buffered))
}
return nil, fmt.Errorf(errText)
}
return val, nil
(We could also just skip reading the buffered portion and just say that there's input after the top-level value without specifying what, but I think the information is helpful).
Should also add a test case to test for this.
Based on my testing I don't think the addition of the test for More
breaks anything, but we should verify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @nmiyake)
safeyaml/jsontoyaml.go, line 17 at r1 (raw file):
Previously, nmiyake (Nick Miyake) wrote…
Would
DecodeJSON
(fully qualified assafeyaml.DecodeJSON
) orJSONtoMapSlice
be more precise?Maybe this is just my particular bias, but when I see "YAML" I generally think of YAML bytes/string, so one of the naming schemes described here would be more intuitive to me. However, don't feel super strongly about this, so also fine to proceed with current naming if you strongly prefer that.
Went with safeyaml.JSONtoYAMLMapSlice
safeyaml/jsontoyaml.go, line 15 at r2 (raw file):
Previously, nmiyake (Nick Miyake) wrote…
Top-line summary should be updated to more precisely reflect functionality. Something like:
JSONtoYAML decodes JSON bytes into an interface where all objects and maps are decoded as a yaml.MapSlice to preserve key ordering. This can be useful if a type defines custom JSON marshal logic and translating the JSON representation to a YAML representation while preserving key ordering is sufficient for serialization. For example: ...
You don't have to use this exact wording, but do think the current wording (particularly, "converts json data to yaml") should be changed to be more precise ("decoding JSON data using yaml.MapSlice for maps and objects").
Done.
safeyaml/jsontoyaml.go, line 30 at r2 (raw file):
Previously, nmiyake (Nick Miyake) wrote…
After doing some local testing, it looks like there's one class of cases that isn't caught here, which is the case where the input contains values after a valid JSON block.
Simplest example is
{}{
. The built-in JSON functions will error withinvalid character '{' after top-level value
, while the tokenizer-based function will ignore everything after the valid JSON and thus successfully return an empty object (see https://stackoverflow.com/a/25187779 for link to source code where this is checked using scanner).After playing around locally, I think it's sufficient to check whether
dec.More
returns true after successful parsing. Would recommend something like:val, err := tokenizerToYAML(dec) if err != nil { return val, err } if dec.More() { errText := "invalid input after top-level value" if buffered, err := ioutil.ReadAll(dec.Buffered()); err == nil { errText += fmt.Sprintf(": %q", string(buffered)) } return nil, fmt.Errorf(errText) } return val, nil(We could also just skip reading the buffered portion and just say that there's input after the top-level value without specifying what, but I think the information is helpful).
Should also add a test case to test for this.
Based on my testing I don't think the addition of the test for
More
breaks anything, but we should verify.
Done, though I decided not to include the extra content just to ensure we do not leak anything sensitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Some minor feedback.
Reviewed 2 of 2 files at r3.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @nmiyake and @bmoylan)
safeyaml/jsontoyaml.go, line 30 at r2 (raw file):
Previously, bmoylan (Brad Moylan) wrote…
Done, though I decided not to include the extra content just to ensure we do not leak anything sensitive.
Yup, makes sense
safeyaml/jsontoyaml.go, line 35 at r3 (raw file):
} if dec.More() { return nil, fmt.Errorf("failed to convert json to yaml: invalid input after top-level json value")
Since this is modeled as a library function, would prefer to keep this more compact and analogous to the native output: "invalid character input after top-level value”
Client libraries can add further context (like the fact that the operation is semantically performing a JSON -> YAML conversion) in their error message if they so choose
safeyaml/jsontoyaml_test.go, line 58 at r3 (raw file):
} else { require.Error(t, err) require.Contains(t, err.Error(), test.Err)
Since the content isn't dynamic, can we just assert equality rather than "Contains"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @nmiyake)
safeyaml/jsontoyaml.go, line 35 at r3 (raw file):
Previously, nmiyake (Nick Miyake) wrote…
Since this is modeled as a library function, would prefer to keep this more compact and analogous to the native output: "invalid character input after top-level value”
Client libraries can add further context (like the fact that the operation is semantically performing a JSON -> YAML conversion) in their error message if they so choose
Done.
safeyaml/jsontoyaml_test.go, line 58 at r3 (raw file):
Previously, nmiyake (Nick Miyake) wrote…
Since the content isn't dynamic, can we just assert equality rather than "Contains"?
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r4.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @bmoylan)
safeyaml/jsontoyaml.go, line 35 at r4 (raw file):
} if dec.More() { return nil, fmt.Errorf("invalid input after top-level json value")
Capitalize "JSON"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @nmiyake)
safeyaml/jsontoyaml.go, line 35 at r4 (raw file):
Previously, nmiyake (Nick Miyake) wrote…
Capitalize "JSON"
Done.
Reviewable isn't allowing submission of review, but lgtm -- thanks! |
This change is