Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optional default for message fields #521

Closed
pvdrz opened this issue Aug 18, 2021 · 11 comments
Closed

Optional default for message fields #521

pvdrz opened this issue Aug 18, 2021 · 11 comments

Comments

@pvdrz
Copy link

pvdrz commented Aug 18, 2021

Hi,

I'd like to know if its reasonable/feasible to make message fields use T instead of Option<T> when deserialized if T implements Default. For example, these messages:

message Bar {
    inner uint32 = 1;
}

message Foo {
    Bar bar = 1;
    uint32 baz = 2;
}

Could be deserialized as

#[derive(Default)]
struct Bar {
    inner: u32,
}

#[derive(Default)]
struct Foo {
    bar: Bar,
    baz: u32,
}

I wonder if recursive messages would be a problem though.

@wendajiang
Copy link

I think this because proto3 spec re-support explicitly add label optional

FYI. https://github.com/protocolbuffers/protobuf/blob/master/docs/implementing_proto3_presence.md

@theduke
Copy link

theduke commented Apr 15, 2022

Any particular reason this was closed, @LucioFranco ?

I second this request.

I'm aware of the implications of field presence, but the Option<_> around message fields is the main blocker for directly using prost generated types in idiomatic Rust code - which is the main appeal of prost.

As is, you either have to live with awkward/confusing code, or always add an extra type with manual mapping.

This could potentially be supported with a Protobuf custom option, or a manual setting in prost-build (globally or even per field).

@LucioFranco
Copy link
Member

Can you expand more on

the Option<_> around message fields is the main blocker for directly using prost generated types in idiomatic Rust code - which is the main appeal of prost.

I don't personally see this as a blocker for writing idiomatic code and that this models the way protobuf handles messages pretty well in the context of a type that is used to deser from the wire.

@theduke
Copy link

theduke commented Apr 15, 2022

This is mostly a problem for me where the Protobuf types are heavily used in business logic, not just for GRPC or other de/serialization.

Fields with messages are very common, often to reuse types between messages. And often (at least in my use cases) these fields are guaranteed to be set, either because breaking changes are extremely unlikely and would require larger coordination across the codebase anyway, or because validation is done at the GRPC boundary.

But when the corresponding Rust type is Option<_>, it:

  • gives readers of the code the impression that the fields are actually optional and not always present
  • mandates sprinkling unwrap() / unwrap_or_default() / .get_mut().unwrap() etc everywhere - which ends up being very unidiomatic and confusing

As mentioned I usually remedy this by creating extra types with Into / From implementations and doing the mapping at the boundary (GRPC , file IO layer, etc).

But that's a whole lot of work for something that would be fixed by Prost just generating the types as non-optional.

(If you can be convinced I'd happily look into implementing this.)

@LucioFranco
Copy link
Member

While I do think using theTryInto/TryFrom is correct in this instance, it is a lot of work. Though as you said this work only needs to happen once and its quite type safe. It also provides a boundary for you to do validation etc.

I would like to hear from others though cc @neoeinstein @danburkert

@xiaoas
Copy link

xiaoas commented Sep 2, 2022

While I do think using theTryInto/TryFrom is correct in this instance, it is a lot of work. Though as you said this work only needs to happen once and its quite type safe. It also provides a boundary for you to do validation etc.

It would be really helpful if some of the fields can be set using default / result in a deserialize error / panics. In most of the cases I just need to create a second struct(which is not backed by the proto file and needs manual maintainence) and unwrap every single Option there.

@xiaoas
Copy link

xiaoas commented Sep 8, 2022

So after some discussion on discord I decided that it would be helpful to discuss why would generating fields without wrapping in an Option would be helpful:

  1. in prost basically every structured field is a Option. Unwrapping before using is not very fun imo (also I seem to be not the only one frustrated here). Especially when a single .unwrap() escalates to .as_ref().unwrap() chains. This is a real headache for non-business fan project writers like me where edge cases donesn't really matter: my protobuf is sent from js code and my rust code just blows up when js doesn't send a field: I'm good with that.
  2. I guess that in every API design, there's some field that is guaranteed to be there: not providing certain field is a breakage in the interface already, and that's very very unlikely. In extreme cases where we really won't get a field anymore, adding back the sophisticated None handling logic is not too much trouble.
  3. Imagine the use case where people want soundness in their code, while saving the trouble (and the performance penalty albeit probably slight) of calling unwrap everytime when they want to actually use the field. The only route they could take is probably as what @theduke has mentioned: define a second struct.

While this could be done ergonomically(With something like the TryFrom trait), there's a memory copy and the impl of TryFrom scales up linearly with the complexity of the struct. To neglect the memory copy, one could do

/// imagine Foo is generated by prost-build
struct Foo {
    pub bar: Option<Bar>,
}
struct Foo2<'a> {
    pub bar: &'a Bar2,
}
impl<'a> From<&'a Foo> for Foo2<'a> {
    fn from(value: &'a Foo) -> Self {
        Self{
            bar: value.bar.as_ref().unwrap().into(),
        }
    }
}

Which is even more complicated than moving the fields into the new struct. Also the user will now need to keep the old struct around.

As a comparison, serde allows declaring default attribute on a field, which allows the deserialization to still happen despite the field is not there: serde fill with the default given in struct definition. So serde allows it's user to choose: whether deal with a deserialization error, or declare the incentive and tell serde that they want a Default::default(). In my opinion this doesn't give away prost's guarantee.

From prost-build's implementation's perspective, I guess we could go with the current solution where user gets to define prost::default on a field just like defining any other attribute. That's less than ideal, but I guess it would be another story.

An altenative way is to implement something like TryFrom: if the field is not in Option and it's mssing, the deserialization will fail. Users need to use the new failable API if they want to get rid of Option while still being reasonably panic-free.

Sorry I'm not good at rust at all, actually I struggled to write the From implementation above. But I'm trying to share my point of view using the prost library. The documentation is great, but my experience turns down when I tries to use the struct prost generated.

@LucioFranco Any feedback is welcome.

@Tudyx
Copy link

Tudyx commented Jan 10, 2023

This can give more insight about why messages are wrapped in Option<T>. But I agree, it’s also very painful for me to define a struct and a conversion function each time there is an unwanted optional.

@Tudyx
Copy link

Tudyx commented Jan 11, 2023

Maybe supporting Google field_behavior could be a solution for this?
The syntax is like that :

message Foo {
  string name = 1 [(google.api.field_behavior) = REQUIRED];
}

Its wildly use at Google, for instance in the pubsub api.

For instance, if we take the google API example from the doc, if we take the UpdateTopicRequest from pubsub:

// Request for the UpdateTopic method.
message UpdateTopicRequest {
  // Required. The updated topic object.
  Topic topic = 1 [(google.api.field_behavior) = REQUIRED];

  // Required. Indicates which fields in the provided topic to update. Must be
  // specified and non-empty. Note that if `update_mask` contains
  // "message_storage_policy" but the `message_storage_policy` is not set in
  // the `topic` provided above, then the updated value is determined by the
  // policy configured at the project or organization level.
  google.protobuf.FieldMask update_mask = 2
      [(google.api.field_behavior) = REQUIRED];
}

The generated code could be something like that :

/// Request for the UpdateTopic method.
#[allow(clippy::derive_partial_eq_without_eq)]
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct UpdateTopicRequest {
    /// Required. The updated topic object.
    #[prost(message, optional, tag = "1")]
    pub topic: Topic,
    /// Required. Indicates which fields in the provided topic to update. Must be
    /// specified and non-empty. Note that if `update_mask` contains
    /// "message_storage_policy" but the `message_storage_policy` is not set in
    /// the `topic` provided above, then the updated value is determined by the
    /// policy configured at the project or organization level.
    #[prost(message, optional, tag = "2")]
    pub update_mask: ::prost_types::FieldMask,
}

We can optimize by removing the option hence we know it's a required field.

@shizzard
Copy link
Contributor

shizzard commented Mar 31, 2024

Hi,

This issue seems to be the most complete thread about the Option<T> problem for nested fields.

Problem with the optional field, that is logically required, is that it conflicts with the idiomatic Rust approach: you should not have data structures that will allow you to have your data in invalid state. For now the most idiomatic way to handle nested Option<T> is to have a set of mirrored data structures with a bunch of TryFrom implementations. This is okay, but this approach forces you to write and maintain ridiculous amount of boilerplate (a small gRPC service with 7 rpcs required me to write about 1 kloc of TryInto boilerplate; this is not counting the backward conversion).

Now, the protobuf. Protocol buffers developers decided to get rid of the required fields for a reason: required fields apply message constraints that are hard to maintain if your contracts (e.g. proto files) are changing, and they change a lot; also gRPC is not the only place protobuf is used in. So the idea behind removing required fields was the following: if you need some field to be required, you have to do the validation on application level, not protocol level. I think that while end users still have some ways to deal with the optional fields, Prost is a valid place to make such primitive validation, by adding some mechanism of marking the field as logically required, as Prost is generating the application-level code, and such mechanism will not be in conflict with gRPC/protobuf principles.

However, this issue been here for a while, and it seems there are still no decision on this matter.
@LucioFranco please comment, I would like to hear your thoughts.

@jiacai2050
Copy link

I think that while end users still have some ways to deal with the optional fields, Prost is a valid place to make such primitive validation, by adding some mechanism of marking the field as logically required, as Prost is generating the application-level code, and such mechanism will not be in conflict with gRPC/protobuf principles.

This totally make sense for me, I wonder maybe we should mention this in README?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants