Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make skipping null values default behavior for protobuf #397

Closed
Egorand opened this issue Mar 5, 2019 · 3 comments
Closed

Make skipping null values default behavior for protobuf #397

Egorand opened this issue Mar 5, 2019 · 3 comments

Comments

@Egorand
Copy link

Egorand commented Mar 5, 2019

Based on #58.

My understanding is that since 0.10.0 there is support for skipping optional values, which can be enabled for JSON with Json(encodeDefaults = false), but it's not clear how to enable this behavior for ProtoBuf. In my opinion this should also be the default behavior for protobuf, since the only way to transfer null values is to omit them in the binary representation.

@sandwwraith
Copy link
Member

Protobuf case can be more complicated than JSON case since JSON can distinguish between null and absence of value, and protobuf not. Moreover, it does not have a concept of null itself.

Imagine you deserialize data class with Protobuf and it has field val x: MyData? = MyData.DEFAULT_VALUE. You don't see x in the input stream. What does it mean? If the other side was using kotlinx.serialization too looks like you have to use the default value. If other side using other protobuf implementation, should you use null? Or construct MyData instance with zeroed fields, as the spec says?

There is also a problem with val x: Int = 42. Usually, protobuf implementations set missing values to 0. But it conflicts with our desire to skip 42 in output stream: If we omit it, other clients will receive incorrect data.

Seems like the proper solution is to skip default values only if they're nulls or 0s. We'll try to investigate what can we do here and what consequences it can bring.

@Egorand
Copy link
Author

Egorand commented Mar 6, 2019

Unlike JSON, protobuf relies on explicitly defined schema, hence I believe some of the uncertainties you mention can be resolved by looking at the schema.

Let's take the following message:

message Employee {
  int32 id = 1;
  optional string name = 2;
  optional bool is_contractor = 3 [default = false];
}

With this schema, message recipients should always expect id value to be present in the message, while name and is_contractor can be omitted. When name is missing its value defaults to null, since no other default has been set. When is_contractor is missing it gets the value of false, which also implies that is_contractor can never be assigned a value of null. Hence, the following would be the correct way to represent Employee as a @Serializable data class:

data class Employee(
  @SerialId(1) val id: Int,
  @SerialId(2) @Optional val name: String? = null,
  @SerialId(3) @Optional val is_contractor: Boolean = false
)

Also, since protobuf always has schema, the code is usually generated, and the generator is responsible for ensuring that correct semantics are preserved for all fields. For instance, val x: MyData? = MyData.DEFAULT_VALUE would be incorrect, since x can either be a field with a default value, which would it make it non-nullable, or a simple optional field, meaning its default value should be null.

To sum up, the deserializer should always rely on the default value of a field if it's missing on the wire, and the code generator is responsible for properly translating proto messages into data classes, preserving the semantics of individual fields.

@jamesachn
Copy link

For instance, val x: MyData? = MyData.DEFAULT_VALUE would be incorrect, since x can either be a field with a default value, which would it make it non-nullable, or a simple optional field, meaning its default value should be null.

What about the case:

optional string name = 2 [default = "default_value"]

with

// Omitted @Optional since that is deprecated now for properties with default value.
@SerialId(2) val name: String = "default_value"

In this case when receiving an input stream without name in the payload, the expectation I think is to deserialize with name set to "default_value". I think this is fairly unambiguous as protobuf spec does not seem to enforce that sender and receiver even share the same default value:

Reference: https://developers.google.com/protocol-buffers/docs/proto#optional

When a message is parsed, if it does not contain an optional element, the corresponding field in the parsed object is set to the default value for that field.

It may induce questionable versioning story, but that's really dependent on the application usage of protobuf and is really application's own problem.

On the flip side (serialize for a class with optional field and default value), I think the enhancement behavior, if considered, can extend beyond just skipping if default value is null. If there is a default value, then the field is considered optional to kotlinx.serialization. Why restrict to just skipping null default values? As long as the object instance's optional field has a value that == the default value, it should be considered skippable. It is the receiver's job to decide how to deal with the lack of value for the optional field. If receiver is proto2, it will use its schema's default value. If receiver is proto3, it will use default value for that field type. If receiver is kotlinx.serialization, it will use the class definition's declared default value for that field. This seems consistent with how protobuf treats optional fields with respect to default values (i.e. if optional field is not in the payload, receiver fills in the blank according to its default-value behavior).

brudaswen pushed a commit to brudaswen/kotlinx.serialization that referenced this issue Jan 9, 2020
Porting a project using the *Java ProtoBuf library* to *kotlinx.serialization* can become quite hard since optional properties are not supported in the same way.

On the one hand it is not possible to omit optional properties during serialization (this is possible by not setting an optional property while building the message in Java).
On the other hand it is not possible to check if a property has been omitted by the sender during de-serialization (this is possible via `Message.hasA()` in Java).

Allowing to set `ProtoBuf.shouldEncodeElementDefault()` to `false` allows to create `@Serializable data classes` that support optional properties during serialization and de-serialization. Additionally, it makes `null` values possible for optional properties as well.

Closes Kotlin#397 Make skipping null values default behavior for protobuf
Closes Kotlin#71 NULLs are not supported when writing to protobuf
brudaswen pushed a commit to brudaswen/kotlinx.serialization that referenced this issue Jan 20, 2020
Porting a project using the *Java ProtoBuf library* to *kotlinx.serialization* can become quite hard since optional properties are not supported in the same way.

On the one hand it is not possible to omit optional properties during serialization (this is possible by not setting an optional property while building the message in Java).
On the other hand it is not possible to check if a property has been omitted by the sender during de-serialization (this is possible via `Message.hasA()` in Java).

Allowing to set `ProtoBuf.shouldEncodeElementDefault()` to `false` allows to create `@Serializable data classes` that support optional properties during serialization and de-serialization. Additionally, it makes `null` values possible for optional properties as well.

Closes Kotlin#397 Make skipping null values default behavior for protobuf
Closes Kotlin#71 NULLs are not supported when writing to protobuf
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants