Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove or document field set implementations for extensions and unknown fields #687

Open
osa1 opened this issue Jun 22, 2022 · 2 comments
Open
Labels

Comments

@osa1
Copy link
Member

osa1 commented Jun 22, 2022

Currently we have 3 field set implementations:

I don't understand why we need 3 different implementations and it's not documented. Conceptually we just need to map integers (field tags) to proto values. As far as I understand, the only difference between a known field and an unknown field is that for unknown fields FieldInfo won't be available. We don't need this much code for this.

We will probably need a new value class (like PbMap, PbList) for unknown length-delimited values (strings, bytes, messages, repeated fields). The representation will just be Uint8List.

I'm not sure if we need to support groups, but we will need another value class for unknown groups, with Uint8List to store the group contents.

When we need to check whether the field for a tag is known, unknown, or extension, we could have Set<int> unknownFields and Set<int> extensionFields in _FieldInfo.

One question is how to merge two unknown groups or length-delimiteds. Merging values is not specified in proto spec. We have two options:

  • Collect values of merged groups and length-delimited stuff. This means PbUnknownLengthDelimited (or whatever we want to call it) will store List<Uint8List> instead of just Uint8List, and add to the list as we merge values. Same for groups.

    This is currently what we do in UnknownFieldSet.

  • Override the current value. This is simpler.

If I'm missing something and we really need 2 extra classes and a few hundred extra lines and tests then we should document why this is needed.

@osa1 osa1 added the question label Jun 22, 2022
@sigurdm
Copy link
Collaborator

sigurdm commented Jun 23, 2022

I don't know the history for why these are represented separately - but I agree - it is better to gather these into one FieldSet class - that avoids inconsistencies between them. (They could even potentially be folded into GeneratedMessage - but I think there's a good reason not to do that: All the calls on FieldSet can stay monomorphic - if they are calls on subclasses of GeneratedMessage they might go polymorphic...)

@osa1
Copy link
Member Author

osa1 commented Aug 2, 2022

UnknownFieldSet is not a private type and we have a few users. I checked the users, we have two use cases:

  • To generate protobuf binary messages without a proto description. For example:

    UnknownFieldSet()..addField(
        1,
        UnknownFieldSetField()..addVarInt(123)
    ).writeToBuffer()

    If we had a public field type class/enum, we could use CodedBufferWriter for this use case: (not tested)

    CodedBufferWriter()..writeField(1, FieldType.int32, 123).toBuffer()

    (See Introduce a class for field types #605 for adding a public field type class)

  • To skip fields in custom protobuf deserializers:

    UnknownFieldSet().mergeFieldFromBuffer(tag, input)

    (where input is CodedBufferReader)
    For this use case we could provide a CodedBufferReader to skip fields. I don't think we have anything else other than UnknownFieldSet that covers this use case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants