Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Codegen: unknown fields should be retained #93

Open
scottlamb opened this issue Feb 12, 2018 · 3 comments
Open

Codegen: unknown fields should be retained #93

scottlamb opened this issue Feb 12, 2018 · 3 comments

Comments

@scottlamb
Copy link

If I parse a message with unknown fields and then serialize it, I'd like them to be written back out. This is a big safety improvement; it means that if there are still older clients accessing a shared database, they don't break things they don't understand.

This matches the behavior of the official C++ implementation for both proto2 and proto3. (proto3 originally dropped support for this, but it was added back after a huge backlash.)

@tafia
Copy link
Owner

tafia commented Mar 1, 2018

Yes, like #92 this should be implemented with a dedicated flag on pb-rs.

@imoverclocked
Copy link

Maybe we should optionally retain them since proto3 is "in the wild" with this behavior now? It's possible some people are relying on not having extra fields once the local proto3 definition removes the field.

Use-case: someone imports com.googlecode.protobuf.format.*Format and uses it to output json/xml/etc. Adding unknown fields back into proto3 in a new version now pollutes the output and potentially breaks known schemas.

@scottlamb
Copy link
Author

Use-case: someone imports com.googlecode.protobuf.format.*Format and uses it to output json/xml/etc. Adding unknown fields back into proto3 in a new version now pollutes the output and potentially breaks known schemas.

Those formatters probably drop unknown fields. I haven't looked at them, but without a descriptor that mentions the field, all they have is a tag number and wire type (e.g. varint); they can't know the correct name, type (e.g. int32/int64), if it's a repeated field, etc., so they have limited options when outputting formats that expect textual field names and such.

In the official C++ client, there's no global/per-file/per-message knob for unknown field behavior AFAIK. You can explicitly drop unknown fields on a particular object when desired via myproto.mutable_unknown_fields().Clear().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants