absent JSON properties: explicit null value vs. omitting property #4

Closed
pvdbosch opened this issue Jan 6, 2020 · 4 comments

@pvdbosch
Contributor

pvdbosch commented Jan 6, 2020

In GitLab by @wsalembi on Feb 23, 2018, 08:56

Decide on how to model absent JSON properties in a JSON document:

  • as an explicit null value (propertyname: null)
  • by omitting the property

Decide on how to model empty arrays:

  • as arrayname: []
  • by omitting the property

Remarks:

  • VAS: uses null values

    • For GET, for self-documentation purposes, we decided that our REST APIs always return all the managed properties in the JSON bodies. This means that if a property is null, it still appears in the JSON body. Reading a response body is then like reading the contract. ;o)
      In the following example, the REST API returns 3 properties and will always return those 3 properties (another version of the API may return more). By querying the REST API, the client discovers which properties are returned for each resource.
      { "first_name": null, "last_name": "Vincent", "ssin": "00000000097"}
      To allow this, we created a Holder class and we configure the Jackson ObjectMapper by calling the method setSerializationInclusion(JsonInclude.Include.NON_NULL), which tells Jackson to serialize only non-null fields (in our case, the fields whose Holder is null are left out).
      In some cases, we agree that the added null properties can "pollute" the response for a human reader, but humans read the JSON body more often during development than in production. We therefore prefer adding self-documentation to the response over cleaning it up. BTW, the properties defined in the Swagger specs cover all methods, and it is not clear enough whether a property is returned in a search or in a GET.
    • For PUT, if a property is missing, the property should be removed from the DB.
    • For PATCH
      • if a property is missing, the property should not be updated.
      • if a property is set to null, the property should be removed from the DB.
  • CBSS: strong preference to strip properties with null values

    • we have a lot of business data with a lot of optional fields
      => stripping properties makes the data a lot more readable (for operational purposes) and reduces body length
    • Allowing null values and distinguishing them from absent properties takes a lot of work:
      • in programming: see example with Holder types in Jackson
      • stripping null values is the default in JSON-B
      • "nullable: true" would have to be added for every property in the OpenAPI/swagger specification
    • For discoverability of properties, use of swagger/openapi should be sufficient
  • FOD Fin:
    the vendor APIs we use rely on explicit null values. But we do not have an explicit standard yet, so "omitting property" would not be a problem.

@pvdbosch
Contributor Author

pvdbosch commented Jan 6, 2020

In GitLab by @wsalembi on Feb 23, 2018, 08:56

null isn't supported in OpenAPI 2.0, only in OpenAPI 3.0 when adding nullable: true to the definition of the property.

In JSON, there is a semantic difference between explicit null (value is null), and an absent property (value is undefined). I don't expect many clients would take into account this difference however.

The Paypal styleguide makes some good arguments on why to avoid differentiating between null and undefined:

In most strongly typed languages, such as Java, there is no concept of an undefined type, which means that for all undefined fields in a JSON object, a deserializer would return the value of such types as null when you try to retrieve them. Similarly, some Java-based JSON serializers serialize fields to JSON null by default, even though it is not possible for the serializer to determine whether the author of the Java object intended for that property to be defined with a value of null, or simply undefined.
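
The point made in the quote can be seen with plain Java collections. This is a minimal sketch, assuming the JSON object has been parsed into a java.util.Map; the class and method names (NullVersusAbsent, explicitNull, absent) are made up for illustration and do not come from any library mentioned in this thread.

```java
import java.util.HashMap;
import java.util.Map;

final class NullVersusAbsent {

    /** Parsed form of {"first_name": null, "last_name": "Vincent"}. */
    static Map<String, Object> explicitNull() {
        Map<String, Object> doc = new HashMap<>();
        doc.put("first_name", null);
        doc.put("last_name", "Vincent");
        return doc;
    }

    /** Parsed form of {"last_name": "Vincent"}. */
    static Map<String, Object> absent() {
        Map<String, Object> doc = new HashMap<>();
        doc.put("last_name", "Vincent");
        return doc;
    }

    public static void main(String[] args) {
        // get() cannot tell the two documents apart: both lookups yield null.
        System.out.println(explicitNull().get("first_name"));
        System.out.println(absent().get("first_name"));
        // Only containsKey() preserves the difference: true, then false.
        System.out.println(explicitNull().containsKey("first_name"));
        System.out.println(absent().containsKey("first_name"));
    }
}
```

A typical DTO with a plain String field behaves like get() here, which is why client code rarely notices whether the property was null or missing.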

For C# Web API, the (de)serializing has to be modified as well.

If we want to support JSON merge patch however, making a difference between null and undefined will be required.
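
To illustrate why merge patch needs the distinction, here is a rough sketch of the object rules of JSON Merge Patch (RFC 7386), assuming documents are represented as nested java.util.Map instances. The class name MergePatchSketch and the sample values ("Jean", "Dupont") are hypothetical; this is not how any of the APIs discussed here implement PATCH.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MergePatchSketch {

    /**
     * Applies a merge patch to a target object without modifying the target:
     * a null member removes the property, an absent member leaves it untouched,
     * nested objects are merged recursively, any other value replaces the old one.
     */
    @SuppressWarnings("unchecked")
    static Map<String, Object> apply(Map<String, Object> target, Map<String, Object> patch) {
        Map<String, Object> result = new LinkedHashMap<>(target);
        for (Map.Entry<String, Object> member : patch.entrySet()) {
            String name = member.getKey();
            Object value = member.getValue();
            if (value == null) {
                result.remove(name);
            } else if (value instanceof Map) {
                Object current = result.get(name);
                Map<String, Object> base = current instanceof Map
                        ? (Map<String, Object>) current
                        : new LinkedHashMap<String, Object>();
                result.put(name, apply(base, (Map<String, Object>) value));
            } else {
                result.put(name, value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("first_name", "Jean");
        stored.put("last_name", "Vincent");
        stored.put("ssin", "00000000097");

        Map<String, Object> patch = new LinkedHashMap<>();
        patch.put("first_name", null);    // explicit null: remove the property
        patch.put("last_name", "Dupont"); // value: replace
        // "ssin" is absent from the patch: leave it as is

        System.out.println(apply(stored, patch)); // {last_name=Dupont, ssin=00000000097}
    }
}
```

If the deserializer collapsed "absent" into null, the ssin property in this example would be removed as well, which is exactly the ambiguity merge patch cannot tolerate.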

For CBSS, we handle quite a few data exchanges with sparse data types (i.e. many null values), e.g. the vehicle register and the pension register. Another third-party example is the OpenShift/Kubernetes API. Human readability is important for us as a service integrator, because our developers and service desk have to work a lot with the in-transit data format. The larger size of the data structure may be an issue for bulk messages, e.g. an application exporting data on all Belgian citizens.

Concerning different 'views' on the same resource as mentioned by VAS:

the properties defined in the Swagger specs are for all methods and it is not clear enough if the property is returned in a search or in a GET.

From experience in our SOAP services:

  • for 'static' views depending on the call and not on the user, we create another data type with only the subset, e.g. when a search functionality only returns a limited subset of the data. This makes the service description clearer and self-describing.
  • for data elements filtered because the client isn't authorized, we reuse the data type and add a list of filtered data to the response. Example (JSON equivalent): "_filteredData" : ["career/capitalisation", "career/migrants"]. I haven't seen much use of this metadata by applications however, as the authorizations are quite static and clients know what they've access to in advance of using the API.
  • we try to avoid proliferation of too many views on a same data type, as clients want the representations to be consistent. If including the entire data type in an operation isn't desired for performance reasons, we often only return the resource identifier so clients can use another service to retrieve the resource data.

@pvdbosch
Contributor Author

pvdbosch commented Jan 6, 2020

In GitLab by @wsalembi on Feb 23, 2018, 08:56

Introduction

I think it will be very difficult to decide whether all REST APIs should strip null from JSON representations, because the APIs are not all the same and do not have the same complexity.

We already discussed null properties for POST, PUT and PATCH in the issue "usage of PATCH/PUT method to update resources". We only have to discuss GET here.

Object and simple properties

To strip the property

If an API with a lot of optional properties returns all fields, the representations are less readable by humans and the size of the messages increases.

Returning only non-null values has the following benefits:

  • it is easy to implement (it is the default behavior) and keeps DTO classes simple (simple getters and setters).
  • messages are smaller (but the difference is only significant when the entity has a lot of optional properties).

The downside is that the API documentation has to explain when a property is returned and when it is not, and maintaining documentation is something we do not always do.

To return null values

The goal is not to return all properties, but only those that the client can access and/or that the GET method (with the given parameters) can return. In other words, the goal is to return the whole representation from the client's point of view (depending on the request, the rights of the user, the state of the requested entity, ...).

Returning all these properties then has the following benefit: the API is more self-documenting. If the client does not receive a property, it means either that the client is not allowed to receive it or that this GET request does not return it. There is no need to document this somewhere else, and both the clients and the developers of the REST API can easily see that a property is missing and identify a bug more quickly.

The downsides are:

  • there is some more code (see below) and more configuration (see below).
  • the size of the messages can significantly increase when the entity has a lot of optional properties.
  • jsonschema2pojo is not able to generate getters and setters that use Optional or any other holder class. This means that the approach is not compatible with a contract-first methodology.

For some projects, we use a library we created: https://git01.smals.be/vas/jackson-holder (snapshot of the code in the attached ZIP).

This library is simpler than Optional. Its only goal is to hold a value.

Here is a code sample that uses this library:

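import static be.smals.vas.jackson.holder.Holder.holderFor;
import static be.smals.vas.jackson.holder.Holder.holderValue;
import static com.fasterxml.jackson.annotation.JsonAutoDetect.Visibility.ANY;
import static com.fasterxml.jackson.annotation.JsonAutoDetect.Visibility.NONE;
import static com.fasterxml.jackson.annotation.JsonInclude.Include.NON_NULL;

import be.smals.vas.jackson.holder.Holder;
import com.fasterxml.jackson.annotation.JsonAutoDetect;
import com.fasterxml.jackson.annotation.JsonInclude;

@JsonInclude(NON_NULL)
@JsonAutoDetect(fieldVisibility = ANY, getterVisibility = NONE, isGetterVisibility = NONE, setterVisibility = NONE)
public class Client {

    // A null Holder means "property absent"; a Holder wrapping null means "explicit null".
    private Holder<String> ssin;

    public String getSsin() {
        return holderValue(ssin);
    }

    public void setSsin(final String ssin) {
        this.ssin = holderFor(ssin);
    }
}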

Static methods holderValue() and holderFor() are defined in the class be.smals.vas.jackson.holder.Holder.

If we want to return the ssin property, we have to call setSsin() with a value or with null. Calling setSsin() with null will add the property in the JSon representation with a null value. If we do not call setSsin(), the ssin property will not be part of the representation.
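
The three resulting states can be sketched without Jackson. The code below is only an approximation of the idea under stated assumptions: the nested Holder class is a hypothetical stand-in, not the real be.smals.vas.jackson.holder.Holder, and toJsonProperties() is an invented helper that mimics what NON_NULL inclusion on the holder field would produce.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class HolderSketch {

    /** Hypothetical stand-in for the library's Holder: it only wraps a possibly-null value. */
    static final class Holder<T> {
        final T value;

        Holder(T value) {
            this.value = value;
        }
    }

    /** DTO with the same shape as the Client class above. */
    static final class Client {
        private Holder<String> ssin; // stays null as long as setSsin() is never called

        String getSsin() {
            return ssin == null ? null : ssin.value;
        }

        void setSsin(String ssin) {
            this.ssin = new Holder<>(ssin);
        }

        /** Includes the property when the holder exists, even if the held value is null. */
        Map<String, Object> toJsonProperties() {
            Map<String, Object> properties = new LinkedHashMap<>();
            if (ssin != null) {
                properties.put("ssin", ssin.value);
            }
            return properties;
        }
    }

    public static void main(String[] args) {
        Client untouched = new Client();
        Client explicitNull = new Client();
        explicitNull.setSsin(null);
        Client withValue = new Client();
        withValue.setSsin("00000000097");

        System.out.println(untouched.toJsonProperties());    // {}
        System.out.println(explicitNull.toJsonProperties()); // {ssin=null}
        System.out.println(withValue.toJsonProperties());    // {ssin=00000000097}
    }
}
```

Note that getSsin() returns null in the first two cases, so callers of the getter see a normal DTO; only the serializer looks at the holder itself.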

The @JsonInclude(NON_NULL) annotation configures Jackson to return non-null values only, no matter how the ObjectMapper is configured.

The @JsonAutoDetect(fieldVisibility = ANY, getterVisibility = NONE, isGetterVisibility = NONE, setterVisibility = NONE) annotation configures Jackson to read/inject the values directly from/into the fields (without using the getters/setters). To make this work, the jackson-holder library defines a Jackson module that can be discovered automatically by calling the method findAndRegisterModules() on your object mapper. When parsing, this module will

This annotation can also be defined on an interface so that all DTOs inherit it, which is more convenient.

Using the Holder class also allows fine-grained manual PATCH handling, because you can identify which properties were present in the representation and which were not.

If the client uses jackson-holder, it can also check whether a property was given and, for example, work out whether the user has access to the information. The user's web page can then indicate whether the information is restricted.

Remarks about nullable: true

We already have this "problem" with the PATCH method: the client has to send the property with a null value to set it to null in the database. So I don't think we should worry about this "problem", or else we should rethink PATCH. Furthermore, OpenAPI 3.0 will fix it. :o)

Improvement: Lombok

Lombok can generate a lot of code (getters and setters, for example) and it allows extensions. We could then create an extension that spares developers from writing the specific getters and setters shown above.

Arrays

To strip the property

As for the objects and the simple properties, when the array is empty or null, we can strip the property.

To return null values

As for the objects and the simple properties, when the array is empty or null, we can return the property.

To keep the same behavior, I suggest returning null instead of [].

I did not have the time to test jackson-holder with arrays, but I expect that it will depend on how the ObjectMapper is configured (SerializationFeature#WRITE_EMPTY_JSON_ARRAYS).

My conclusion

I think the designer of the API should be able to decide which option to use, depending on the size of the entities, whether there are a lot of optional properties, whether the representations depend on the user and/or the state of the entity, ...

Returning "all" the properties has some benefits, but they are not always useful. I think we should not force APIs to return either all the properties or only the non-null ones. We should probably document both solutions and let the API designer choose the one that fits the needs.

@pvdbosch
Contributor Author

pvdbosch commented Jan 6, 2020

In GitLab by @wsalembi on Feb 23, 2018, 08:56

Updated the styleguide.

It was decided to keep the rule that null values SHOULD be stripped, as this is the simplest solution, compatible with OpenAPI 2.0 and simpler to specify in OpenAPI 3.0.

The 'SHOULD' still leaves some room for exceptions. The JSON Merge patch is explicitly mentioned as an exception in the styleguide.

Empty arrays and strings are considered different from null values and are not impacted by the rule. No rule is specified for these; they can be allowed or disallowed depending on the specific use case.
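
As an illustration of the rule, here is a small sketch that strips null-valued properties from a document held as nested java.util.Map and java.util.List instances while leaving empty arrays and empty strings alone. It is not taken from the styleguide: the class name NullStrippingSketch and the sample properties middle_name and addresses are hypothetical, and leaving null elements inside arrays untouched is an assumption, since the rule only speaks about properties.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NullStrippingSketch {

    /** Returns a copy of the object without null-valued properties, at every nesting level. */
    static Map<String, Object> stripNulls(Map<String, Object> object) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> property : object.entrySet()) {
            if (property.getValue() != null) {
                result.put(property.getKey(), stripValue(property.getValue()));
            }
        }
        return result;
    }

    /** Recurses into nested objects and arrays; empty arrays and strings pass through unchanged. */
    @SuppressWarnings("unchecked")
    private static Object stripValue(Object value) {
        if (value instanceof Map) {
            return stripNulls((Map<String, Object>) value);
        }
        if (value instanceof List) {
            List<Object> items = new ArrayList<>();
            for (Object item : (List<Object>) value) {
                // Assumption: null array elements are kept, only properties are stripped.
                items.add(item == null ? null : stripValue(item));
            }
            return items;
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, Object> client = new LinkedHashMap<>();
        client.put("first_name", null);             // stripped
        client.put("middle_name", "");              // kept: empty string is not null
        client.put("last_name", "Vincent");
        client.put("addresses", new ArrayList<>()); // kept: empty array is not null

        System.out.println(stripNulls(client));
    }
}
```

With Jackson or JSON-B this filtering is normally left to the serializer configuration; the sketch only makes the boundary between null and "empty" explicit.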

@pvdbosch
Contributor Author

pvdbosch commented Jan 6, 2020

In GitLab by @wsalembi on Feb 23, 2018, 08:56

closed
