-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
encoding/protojson: opt out of whitespace randomization #1082
Comments
The reason why we were unable to make improvements to both text and JSON was because of the many hard-coded tests that expected the output to be byte-for-byte identical. We're not going to repeat the mistake again with the new I understand that stable output has benefits, but unless there is a canonical definition (which there isn't) for how to serialize protobufs in text and JSON, we're not going to pretend like one exists. If you want a frozen output, then use |
This is what I'm doing, for the moment marshaler = &protojson.MarshalOptions{
// no point turning this on if we're reformatting immediately
// Multiline: true,
EmitUnpopulated: true,
Indent: "",
UseProtoNames: true,
}
data, err := marshaler.Marshal(cfg)
if err != nil {
return err
}
var rm json.RawMessage = data
data2, err := json.MarshalIndent(rm, "", " ")
if err != nil {
return err
}
data2 = append(data2, '\n')
return w.Write(data2) |
That seems like a reasonable workaround? |
Closing as we're not going to be changing the randomization. Analysis of all tests at Google (we have a lot) for text and JSON usages indicates that 89.5% do not care about the exact output. Of the remaining, 10.4% were brittle tests that really should have been written to perform a semantic comparison on the structured messages (rather than a byte-for-byte comparison on the serialized output). The remaining 0.1% were cases like generated configurations and/or code where a degree of stability is actually needed. For that 0.1%, the workaround is to pass the output to a JSON formatter or text formatter to produce more stable output. If randomization pushes the 10.4% to properly perform semantic comparisons, then I'll be really happy. Since a workaround exists for the 0.1% with a legitimate need for some degree of stability, I don't think a opt-out is justified. |
It's slightly amusing that your own tests disable randomization, and yet you don't want to expose that to users 😅 |
Those frustrated by the randomization should be focusing on the development of a stable output format: #1121 |
The protobuf library is subject to the contract of the standardization. However, certain things (e.g. ordering of fields in a message) are undefined in the contract, but this library is still an authority on what its own self should be producing. Knowing how we have chosen those implementation details, we can generate golden tests dependent upon our own implementation details. We remain, after all, the authority on what we expect our code to generate, even if the contract is silent on certain details. We’re allowed to make stronger assumptions about the output than the standard provides. After all, if we change our output, we can also change our tests at the same time, and then the deltas of golden examples even documents how our output has changed. Meanwhile, users of this package are not in that same situation. If this package changed the ordering of fields, it would break anyone else using golden test, even though the two variations are semantically identical. Their tests would break, even though the output itself remains valid and within the standard. So their golden tests fail, even though their code has not broken. It’s generally bad form to break people’s tests for non-functional issues. Trying to keep Hyrum’s Law at bay is part of us looking forward to a potentially future formal clarification of deterministic output. We should hope that even if that standard requires us to make drastic changes, that no one using this package now should break tests one we implement it. |
Commit protocolbuffers/protobuf-go@582ab3d added randomization to the multi-line json output.
I understand there are good reasons for doing this. I hope you will understand I also have good reasons for not doing this. Among others, generating a protobuf with fake data, re-rendering it on every test run is one of the best ways to detect drift in the configuration, accidental or purposeful.
I can handle the output changing across multiple versions of the protobuf library. It's much easier for me to sync the library version than it is to sync Go versions. If there are breaking changes that generate a diff, I can manage them all at once, instead of
Is there any way to opt out? A
UnsafeDisableRandomization
flag, or something. Right now I am probably going to make a second pass with a marshaler that does not generate random whitespace.The text was updated successfully, but these errors were encountered: