encoding/protojson: opt out of whitespace randomization #1082

kevinburke1 · 2020-04-13T20:13:06Z

Commit protocolbuffers/protobuf-go@582ab3d added randomization to the multi-line json output.

I understand there are good reasons for doing this. I hope you will understand I also have good reasons for not doing this. Among others, generating a protobuf with fake data, re-rendering it on every test run is one of the best ways to detect drift in the configuration, accidental or purposeful.

I can handle the output changing across multiple versions of the protobuf library. It's much easier for me to sync the library version than it is to sync Go versions. If there are breaking changes that generate a diff, I can manage them all at once, instead of

Is there any way to opt out? A UnsafeDisableRandomization flag, or something. Right now I am probably going to make a second pass with a marshaler that does not generate random whitespace.

The text was updated successfully, but these errors were encountered:

dsnet · 2020-04-13T20:21:18Z

The reason why we were unable to make improvements to both text and JSON was because of the many hard-coded tests that expected the output to be byte-for-byte identical. We're not going to repeat the mistake again with the new prototext and protojson implementations.

I understand that stable output has benefits, but unless there is a canonical definition (which there isn't) for how to serialize protobufs in text and JSON, we're not going to pretend like one exists.

If you want a frozen output, then use github.com/golang/jsonpb. The output will be stable, but will also continue to have buggy behavior in some cases and it won't received any improvements. If you want a decent amount of stability, then pass the output through a formatter. For example, just parsing the JSON with encoding/json and formatting it again should provide some degree of stability.

kevinburke1 · 2020-04-13T20:25:53Z

This is what I'm doing, for the moment

	marshaler = &protojson.MarshalOptions{
		// no point turning this on if we're reformatting immediately
		// Multiline:       true,
		EmitUnpopulated: true,
		Indent:          "",
		UseProtoNames:   true,
	}
	data, err := marshaler.Marshal(cfg)
	if err != nil {
		return err
	}
	var rm json.RawMessage = data
	data2, err := json.MarshalIndent(rm, "", " ")
	if err != nil {
		return err
	}
	data2 = append(data2, '\n')
	return w.Write(data2)

dsnet · 2020-04-13T20:26:55Z

That seems like a reasonable workaround?

dsnet · 2020-05-01T19:30:44Z

Closing as we're not going to be changing the randomization. Analysis of all tests at Google (we have a lot) for text and JSON usages indicates that 89.5% do not care about the exact output. Of the remaining, 10.4% were brittle tests that really should have been written to perform a semantic comparison on the structured messages (rather than a byte-for-byte comparison on the serialized output). The remaining 0.1% were cases like generated configurations and/or code where a degree of stability is actually needed. For that 0.1%, the workaround is to pass the output to a JSON formatter or text formatter to produce more stable output.

If randomization pushes the 10.4% to properly perform semantic comparisons, then I'll be really happy. Since a workaround exists for the 0.1% with a legitimate need for some degree of stability, I don't think a opt-out is justified.

brandonchinn178 · 2024-04-19T22:41:31Z

I don't think a opt-out is justified.

It's slightly amusing that your own tests disable randomization, and yet you don't want to expose that to users 😅

https://github.com/protocolbuffers/protobuf-go/blob/c2a26e757ed57746693fca53de0b6136a8e46b74/encoding/protojson/encode_test.go#L32-L33

dsnet · 2024-04-19T22:54:10Z

Those frustrated by the randomization should be focusing on the development of a stable output format: #1121

puellanivis · 2024-04-20T21:01:11Z

I don't think a opt-out is justified.

It's slightly amusing that your own tests disable randomization, and yet you don't want to expose that to users 😅

The protobuf library is subject to the contract of the standardization. However, certain things (e.g. ordering of fields in a message) are undefined in the contract, but this library is still an authority on what its own self should be producing. Knowing how we have chosen those implementation details, we can generate golden tests dependent upon our own implementation details. We remain, after all, the authority on what we expect our code to generate, even if the contract is silent on certain details. We’re allowed to make stronger assumptions about the output than the standard provides. After all, if we change our output, we can also change our tests at the same time, and then the deltas of golden examples even documents how our output has changed.

Meanwhile, users of this package are not in that same situation. If this package changed the ordering of fields, it would break anyone else using golden test, even though the two variations are semantically identical. Their tests would break, even though the output itself remains valid and within the standard. So their golden tests fail, even though their code has not broken. It’s generally bad form to break people’s tests for non-functional issues.

Trying to keep Hyrum’s Law at bay is part of us looking forward to a potentially future formal clarification of deterministic output. We should hope that even if that standard requires us to make drastic changes, that no one using this package now should break tests one we implement it.

dsnet changed the title ~~protojson: opt out of whitespace randomization~~ encoding/protojson: opt out of whitespace randomization Apr 13, 2020

dsnet closed this as completed May 1, 2020

shifucun mentioned this issue Oct 8, 2020

Fix random golden json issue datacommonsorg/mixer#330

Merged

mikecutalo mentioned this issue Mar 4, 2022

shortlink: fix test json assert lyft/clutch#2086

Merged

vroldanbet mentioned this issue May 19, 2023

Add support for cursors and limits on LookupResources API calls authzed/spicedb#1296

Merged

d-bytebase mentioned this issue Oct 15, 2023

chore: upgrade mysql-parser dependency bytebase/bytebase#8728

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

encoding/protojson: opt out of whitespace randomization #1082

encoding/protojson: opt out of whitespace randomization #1082

kevinburke1 commented Apr 13, 2020

dsnet commented Apr 13, 2020

kevinburke1 commented Apr 13, 2020 •

edited

Loading

dsnet commented Apr 13, 2020 •

edited

Loading

dsnet commented May 1, 2020

brandonchinn178 commented Apr 19, 2024 •

edited

Loading

dsnet commented Apr 19, 2024

puellanivis commented Apr 20, 2024

encoding/protojson: opt out of whitespace randomization #1082

encoding/protojson: opt out of whitespace randomization #1082

Comments

kevinburke1 commented Apr 13, 2020

dsnet commented Apr 13, 2020

kevinburke1 commented Apr 13, 2020 • edited Loading

dsnet commented Apr 13, 2020 • edited Loading

dsnet commented May 1, 2020

brandonchinn178 commented Apr 19, 2024 • edited Loading

dsnet commented Apr 19, 2024

puellanivis commented Apr 20, 2024

kevinburke1 commented Apr 13, 2020 •

edited

Loading

dsnet commented Apr 13, 2020 •

edited

Loading

brandonchinn178 commented Apr 19, 2024 •

edited

Loading