(De-)Serializing null, required, and empty values in OAS parameters #2037
Comments
Can we get some guidance on this? From my reading of RFC 6570, an empty list, an empty dict, and null are all interpreted the same way. When undef is None or [] or {}, then having {undef} in a path is rendered as an empty string for type=simple. About your nullability interpretation |
Some Possible Paths forward here, which would require breaking changes to openapi:
Note: |
While it may not be the ideal solution, the guidance in PR #3840, which is basically "define it as a string and have your application pre-format it", is really the only thing I can think of short of inventing a whole new standard for stringification. I don't think anyone has the time/resources to create such a standard, but there are also many reasons to allow the ambiguity. OAS's success is partially due to being able to describe existing APIs, which no doubt handle this in a variety of contradictory ways. I think the best we can do is highlight which RFCs are relevant at which time (see also #3818 and other forthcoming PRs in this area, particularly around percent-encoding and using |
Oh and see also #3812 regarding |
PR merged for 3.0.4 and ported to 3.1.1 via PR #3921! |
(De-)Serializing null, required, and empty values in OAS parameters
At a couple of points in the last year, I've tried to understand how to serialize and deserialize parameter values based on an OAS definition. The mainstream cases are all very straightforward (which is great). However, I'm trying to write a generalized processor using an arbitrary OAS definition as a guide, and there are "corners" of the spec whose interpretation is not immediately clear.
This document summarizes my understanding. I think OAS is both great and really handy, so I'm hoping that by writing it down in this much detail, it will lead others to point out flaws in my thinking, or guide others who are struggling with the same questions (e.g. issue #1915). Or both!
A detailed summary exists in the "Parameter serialization" section, further below.
Preliminaries
Materials
Ideas
I tend to think of all the data (payloads and parameters) as having a canonical representation in a JSON-compatible structure (by that I mean, a data structure exactly representable by a JSON string). While I don't like positing abstract intermediate systems, I find it helps reasoning about the translation from an application's own data structures to the world that the OAS specification defines, and from that to the final serialization format (JSON, application/x-www-form-urlencoded, RFC6570, etc). In particular, the path from application to the canonical representation is largely outside of the scope of my discussion here. That transformation, itself, is arbitrarily complex and not knowable by anyone but the developers of that application. My discussion here is focused on going from that JSON-compatible representation to a serialized format and back.
empty
A notion that is important to think about here is that of "empty". This emerges from RFC6570, and simply means a string with 0 elements (characters/octets). RFC6570 appears to only work with strings, lists/arrays of strings, associative arrays (key/value pairs), and undef (discussed below). As such, to understand empty, you must first serialize your values into strings (and the elements of a list into strings, and the values of your key/value pairs into strings). Given the OAS data types, the only values that can be empty are those with type/format string (with no format), string/byte, string/binary, or string/password.
undef
RFC6570 also has a notion of "undef". This is not explicitly named in the OAS spec, but it seems unambiguous that it plays an important part in parameter (de-)serialization. It is described (in section 2.3 of RFC6570) as:
Furthermore:
In our JSON-compatible abstract data structure, this means null, [], {}, and { foo: null } are all "undef" for RFC6570 purposes.
While an empty value is serialized as an empty string (resulting in serializing foo and emptyParam as foo=bar&emptyParam=), undef values are treated as if they had never been serialized in the first place (resulting in serializing foo and undefParam as foo=bar).
allowEmptyValue
One of the OAS properties that seems like it applies to these serialization questions is allowEmptyValue. However, the more you look at this, the more confusing (and probably even contradictory) it seems. tedepstein pursued a heroic effort to get clarity about this (#1573). In his summary, he wrote:
I, personally, find this mildly troubling since it means this "empty" is not the same as the empty from RFC6570. Because this is application dependent ("determined by the API provider"), it isn't clear to me that this actually has any applicability to the questions of serialization and deserialization (from and to that JSON-compatible data structure) that I'm worried about here. Instead, it lives in the translation space from the application's data structures to the JSON-compatible ones.
Given this, and the fact that it is deprecated, my choice is to ignore it entirely in this discussion. This may be a foolhardy decision!
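To make the empty/undef distinction from the preliminaries concrete, here is a minimal Python sketch. The helper names (is_undef, is_empty, serialize_query) are hypothetical, values are assumed to be strings already, and the rules encode only my reading above, not anything the OAS spec states explicitly.

```python
from typing import Any, Dict


def is_undef(value: Any) -> bool:
    """True if the value counts as RFC6570 'undef' under the reading above."""
    if value is None:
        return True
    if isinstance(value, (list, tuple)):
        # A list with zero members is undefined.
        return len(value) == 0
    if isinstance(value, dict):
        # Zero members, or every member value is null.
        return all(v is None for v in value.values())
    return False


def is_empty(value: Any) -> bool:
    """True only for the zero-length string."""
    return isinstance(value, str) and value == ""


def serialize_query(params: Dict[str, Any]) -> str:
    """Serialize string-valued params; undef params are skipped entirely."""
    parts = []
    for name, value in params.items():
        if is_undef(value):
            continue  # as if the parameter had never been there
        parts.append(f"{name}={value}")
    return "&".join(parts)


print(serialize_query({"foo": "bar", "emptyParam": ""}))
print(serialize_query({"foo": "bar", "undefParam": None}))
```

The first call keeps the empty parameter as a bare name= pair, while the second drops the undef parameter altogether.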
Parameter serialization
With all these preliminaries out of the way, let's look at parameter serialization. The first thing to note is that there are several properties that affect this:
As mentioned above, this discussion ignores allowEmptyValue.
As I've worked through the various cases these allow for, I have come to feel that the two central challenges that need to be accounted for are:
null (especially given that nullable property)
In practice, the issue of handling empty values usually gets pulled into the discussion. However, this seems to only be a problem because of trying to answer these two questions.
The following sections present a long (long!) discussion of these questions. My summary of all that is, however:
A lot of the OAS parameter serialization rests on RFC6570. If you take a strict interpretation of the applicability of RFC6570, then null will always be conflated with the absence of a property (and sometimes with other things). In this realm of interpretation, the nullable property doesn't seem to buy you anything (its value is always, in practice, determined by the value of required). It is worth noting that this also has the effect of giving the required property in the Parameter Object a different meaning than that of required in JSON Schema, since the latter does allow something to be both required and null.
On the other hand, if you do not believe RFC6570 is completely controlling the serialization that OAS dictates, then you certainly have leeway to explicitly represent null in the parameter serialization schemes.
The punchline here is that I do not think it is possible to write an OAS-based parameter (de-)serialization system which does not cause some kind of data corruption when going from and then back to the aforementioned canonical JSON-compatible data system. The best you can do is to define some constraints on what data you allow to be fed into the system to start with. From an absolute perspective, this is very unfortunate. From a pragmatic and application-specific perspective, this is probably not actually a big problem (you simply need to be clear what your data transformation path is).
Note that in the tables below, <no prop> indicates the case where the property is not only not null, but the property itself doesn't exist in some way (equivalent to a JSON object {} without the property key present). Also, prop: 'a' is representative of any property with any non-empty primitive value (true, 65, 3.13e-5, 2019-10-20, etc).
Query Parameters
There are four styles for query parameter (de-)serialization:
style: form
For primitive values
We might expect these serializations, given the inputs on the top, and the rules on the left:
nullable=false
nullable=false
nullable=true
nullable=true
Even with this simple table, we have several problems:
required and nullable as describing different cases, then it is reasonable to talk about an absent property that is not null. And if you picture parameters as living in a JSON object, then it is completely reasonable to have an absent property. On the other hand, some will certainly find those notions nonsensical, in which case (2) and (3) aren't problems at all, the nullable property has no meaning, and many of the problems listed below are not problems either.
If your property is not an empty-able string, however, then the serialization table looks much less problematic:
nullable=false
nullable=false
nullable=true
nullable=true
This suggests that non-empty-able values are well-handled!
However, if you believe RFC6570 holds sway here, then this is still a problem since the null value should result in the same serialization as the property not being present:
prop: null
nullable=false
nullable=false
nullable=true
nullable=true
At this point, however, one can see that the value of nullable is redundant. We can simply ignore it. This leaves us with a much simpler table:
prop: null
The only data corruption problem this leaves us with is that the absence of a property and a property with a null value are conflated (which, again, may not actually be a problem in your world).
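The conflation under the strict RFC6570 reading can be shown with a small round trip. This is only a sketch: the function names are hypothetical, values are assumed to be strings, and deserialization leans on Python's urllib.parse.parse_qsl rather than on anything OAS defines.

```python
from typing import Dict, Optional
from urllib.parse import parse_qsl, quote


def serialize_form_primitive(name: str, params: Dict[str, Optional[str]]) -> str:
    """style: form for a primitive, strict reading: null and absent are both omitted."""
    value = params.get(name)  # an absent key and an explicit None both yield None
    if value is None:
        return ""
    return f"{quote(name, safe='')}={quote(value, safe='')}"


def deserialize_form_primitive(name: str, query: str) -> Dict[str, str]:
    """Recover the parameter; an omitted parameter comes back as an absent key."""
    pairs = dict(parse_qsl(query, keep_blank_values=True))
    return {name: pairs[name]} if name in pairs else {}


for original in ({}, {"prop": None}, {"prop": ""}, {"prop": "a"}):
    wire = serialize_form_primitive("prop", original)
    print(original, "->", repr(wire), "->", deserialize_form_primitive("prop", wire))
```

In this sketch the empty string and 'a' survive the round trip, while {"prop": None} comes back as {}, which is exactly the null/absent conflation described above.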
If, on the other hand, we wanted to ignore RFC6570 and allow nullable to have meaning, we can achieve this only with a data constraint like declaring that strings that have minLength: 0 are invalid when a property is nullable: true:
nullable=false
nullable=false
nullable=true
nullable=true
For a human, this is much more complex (and actually using an API that supported all these would likely be a bad developer experience), but for a machine it is fine.
For arrays with explode: false
With all the above discussion done, let's move on to the next case of style: form serialization: that of arrays.
We again start off with a somewhat idealized table:
nullable=false
nullable=false
nullable=true
nullable=true
Notes:
prop: []). In RFC6570 this is the same as a null value.
prop: [null] is identical to prop: [] which is identical to prop: null which is the same as not having the property.
If we take the strict RFC6570 strategy from the previous section (assuming an undef value means the property is not serialized at all), then we end up with the simpler:
prop: null
prop: []
prop: [null]
prop: ['a', null, '']
This leaves us with only a couple remaining problems:
nullable in those schemas)
If, instead, we take the non-strict-RFC6570 stance (which you would do if you want to be able to explicitly represent null), we can eliminate ambiguity if we add several constraints:
minItems: 1 if they are also nullable
minLength: 1 if their parent array is nullable
Note: This is not the only set of constraints that you can add to avoid ambiguity! For example, an alternative to the third would be "Array elements that are an empty-able string type must have a minLength: 1 if their parent array is nullable, unless that parent array has at least minItems: 2".
nullable=false
nullable=false
nullable=true
nullable=true
For arrays with explode: true
We again start off with this full table:
nullable=false
nullable=false
nullable=true
nullable=true
This results in the same set of drawbacks, and the same sets of ways to resolve this, as the explode: false case.
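For the strict RFC6570 reading of style: form arrays, the following Python sketch covers both explode settings. The function name is hypothetical, and dropping null elements before joining is my assumption, chosen so that [null] behaves like [] as described in the notes above.

```python
from typing import List, Optional
from urllib.parse import quote


def serialize_form_array(name: str, items: Optional[List[Optional[str]]],
                         explode: bool) -> str:
    """style: form for arrays, strict reading: undef arrays are not serialized."""
    # Assumption: null elements are dropped, so [None] collapses to [].
    defined = [] if items is None else [i for i in items if i is not None]
    if not defined:
        return ""  # null, [] and [null] all vanish from the query string
    encoded = [quote(i, safe="") for i in defined]
    if explode:
        return "&".join(f"{name}={v}" for v in encoded)
    return f"{name}=" + ",".join(encoded)


for value in (None, [], [None], ["a", None, ""]):
    print(value,
          repr(serialize_form_array("prop", value, explode=False)),
          repr(serialize_form_array("prop", value, explode=True)))
```

Note that in this sketch a one-element array holding an empty string serializes as prop=, the same text a primitive empty string would produce, so some ambiguity remains even after nullable is ignored.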
For objects with explode: false
nullable=false
nullable=false
nullable=true
nullable=true
By now, you can probably see that if you do a strict RFC6570 interpretation, you'll collapse the first four columns into a single one and thereby be able to ignore the nullable property. Deserializing the absent param name will be ambiguous, but other than that it is unambiguous.
If you do a non-strict RFC6570 interpretation, then you'll need to add various constraints.
For objects with explode: true
nullable=false
nullable=false
nullable=true
nullable=true
Notes:
If you go with a strict RFC6570 approach (which again ends up making nullable irrelevant), this becomes much easier to interpret:
prop: null
prop: {}
prop: { p: null }
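A comparable sketch for style: form objects under the strict RFC6570 reading follows. The function name is hypothetical, and members whose value is null are dropped, following the rule that an associative array whose members are all undefined is itself undefined.

```python
from typing import Any, Dict, Optional
from urllib.parse import quote


def serialize_form_object(name: str, obj: Optional[Dict[str, Any]],
                          explode: bool) -> str:
    """style: form for objects, strict reading: undef objects are not serialized."""
    defined = {} if obj is None else {k: v for k, v in obj.items() if v is not None}
    if not defined:
        return ""  # null, {} and { p: null } all vanish
    if explode:
        # Each member becomes its own name=value pair.
        return "&".join(f"{quote(k, safe='')}={quote(str(v), safe='')}"
                        for k, v in defined.items())
    flat = []
    for k, v in defined.items():
        flat.extend([quote(k, safe=""), quote(str(v), safe="")])
    return f"{name}=" + ",".join(flat)


print(serialize_form_object("color", {"R": 100, "G": 200}, explode=False))
print(serialize_form_object("color", {"R": 100, "G": 200}, explode=True))
print(repr(serialize_form_object("color", {"p": None}, explode=True)))
```

With explode: true the parameter name itself never appears in the output of this sketch, so a deserializer has to know the member names in advance to reassemble the object.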
style: spaceDelimited and style: pipeDelimited
These are much the same as one another.
The specification states that these can only be used with array values. It also implies in its table of renderings that it can only be used with explode: false. On the other hand, that table has many errors in it, so I'm not sure it is a reliable source of information. The presumably non-normative swagger.io page (https://swagger.io/docs/specification/serialization/) suggests that explode: true can be used, in which case this is the same as style: form with explode: true. In that case, see above for that discussion.
So, for the explode: false case:
nullable=false
nullable=false
nullable=true
nullable=true
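For the explode: false case of these two styles, a naive round trip shows how a value that contains the delimiter gets corrupted. This is a sketch under loud assumptions: the function names are hypothetical, no percent-encoding is applied (a real implementation might encode the delimiter inside values), and null or empty arrays are simply omitted.

```python
from typing import List, Optional

# Delimiters for the two styles discussed in this section.
_DELIMITERS = {"spaceDelimited": " ", "pipeDelimited": "|"}


def serialize_delimited(name: str, items: Optional[List[str]], style: str) -> str:
    """Join array elements with the style's delimiter (no escaping at all)."""
    if not items:
        return ""  # assumption: null and [] are both left out
    return f"{name}=" + _DELIMITERS[style].join(items)


def deserialize_delimited(name: str, query: str, style: str) -> Optional[List[str]]:
    """Split on the delimiter; an absent parameter comes back as None."""
    prefix = f"{name}="
    if not query.startswith(prefix):
        return None
    return query[len(prefix):].split(_DELIMITERS[style])


original = ["a|b", "c"]
wire = serialize_delimited("prop", original, "pipeDelimited")
print(wire, deserialize_delimited("prop", wire, "pipeDelimited"))
```

The two-element input comes back with three elements, because the pipe inside the first value is indistinguishable from the delimiter once it is on the wire.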
These styles do not claim RFC6570 allegiance. Yet, they create the same deserialization ambiguities that other array serializations produce. (Not to mention that having a | or a space in your value is going to lead to data corruption.)
By now, you'll not be surprised to note that if we went with the same approach as for the RFC6570 cases above, you'd pull all the ambiguity into the leftmost column, and not need the nullable property. Or you can ignore RFC6570 and use a bunch of constraints to avoid the problems.
style: deepObject
The pattern for deepObject is much the same as for pipeDelimited etc. I'm sure you won't mind not seeing yet another table here.
Cookie Parameters
Cookie parameters only allow style: form. Ultimately, everything said above about query parameters applies here, too.
Header Parameters
Header parameters only allow style: simple. The serialization is slightly different from what we saw with style: form, above, because the property name is not prepended.
Primitive values
Note that this may or may not apply to header parameters, since the OAS spec says in one place that this only applies to arrays, and in another says it applies to primitives, arrays and objects.
nullable=false
nullable=false
nullable=true
nullable=true
The OAS specification says that empty is n/a in its example table. I'm not sure what to make of that, particularly given (again) the number of errors in that table, and the fact that the only other mention of n/a is for allowEmptyValue, which in turn is only for query parameters.
In this case, even if we take a strict RFC6570 interpretation (thereby rendering nullable meaningless), we still end up with:
prop: null
Which means that when required is false, we can't distinguish between no property, null and empty.
Both this and the non-strict RFC6570 interpretation require constraints to provide an unambiguous interpretation.
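The three-way collision for style: simple primitives can be sketched as follows. The function name expand_simple is hypothetical, and the sketch only models the RFC6570 expansion of the value. Whether an optional header should instead be omitted entirely is a separate question that it does not address.

```python
from typing import Dict, Optional
from urllib.parse import quote


def expand_simple(name: str, params: Dict[str, Optional[str]]) -> str:
    """style: simple for a primitive: only the value is emitted, never the name."""
    value = params.get(name)  # absent and null are both treated as undef
    if value is None:
        return ""  # undef expands to nothing
    return quote(value, safe="")  # an empty string also expands to nothing


for original in ({}, {"prop": None}, {"prop": ""}, {"prop": "a"}):
    print(original, "->", repr(expand_simple("prop", original)))
```

The first three inputs all produce the same zero-length output, so a receiver has no way to tell them apart without additional constraints.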
For arrays and objects
Evaluating these is left as an exercise for the reader.
Path Parameters
Path parameters have one distinct difference from other parameters. Their required property must be true.
style: simple
Path parameters can be serialized with style: simple, which was discussed above. Because required must be true, this would leave us with:
nullable=false
nullable=true
Strict RFC6570 interpretation leaves this unambiguous. Non-strict requires constraints to avoid data corruption.
style: label
These are almost the same as style: simple, except that . is used as a delimiter, and when a property has an empty value, it is written as . rather than an empty string.
All the discussion from the style: simple case can be applied here.
style: matrix
These are almost the same as style: form, except that the delimiter is ; rather than &, and when a property has an empty value, it is written as prop rather than prop=.
You can borrow the discussion from any other section and apply it here.
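The empty-value behaviour of style: label and style: matrix for primitives can be sketched side by side. The function names are hypothetical, and null is treated as undef in line with the strict RFC6570 reading.

```python
from typing import Optional
from urllib.parse import quote


def expand_label(value: Optional[str]) -> str:
    """style: label for a primitive: a leading dot, then the value."""
    if value is None:
        return ""  # undef produces nothing at all
    return "." + quote(value, safe="")  # an empty value leaves just the dot


def expand_matrix(name: str, value: Optional[str]) -> str:
    """style: matrix for a primitive: ;name=value, or ;name when the value is empty."""
    if value is None:
        return ""  # undef produces nothing at all
    if value == "":
        return f";{name}"
    return f";{name}={quote(value, safe='')}"


for value in (None, "", "a"):
    print(repr(value), repr(expand_label(value)), repr(expand_matrix("prop", value)))
```

Unlike style: simple, both of these styles keep empty and undef distinguishable in this sketch, since the prefix character survives for an empty value.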
Conclusion
I find it hard to believe you actually read this far. If you did, and have comments, I welcome them!