syntactic fallback validation for "format" #54
Comments
Performance concern can be addressed by providing several alternative implementations, e.g. 'fast' date validation would consider '33/13/2016' as valid relying on
Yes, it just seemed like establishing a consistent minimum level would be good, and could be done without adding much of a burden.
This could also be implemented in the meta-schema in such a way that the regex check always occurs, but that seems like a bad idea to impose on either resource-constrained implementations, or on implementations that offer alternative validation mechanisms that may be both more correct and more performant.
I don't see this getting addressed in draft-07. If anyone wants to make that happen, please speak up.
This has more or less been addressed by how I'm hoping to push for an alternative approach to the open-ended nature of
The Problem: "format" is frequently re-implemented using "pattern" because it is unreliable
The "format" keyword is currently defined as an optional feature of JSON Schema. This frees implementations from the relatively burdensome requirements of performing the specified semantic validations, but also intentionally makes the feature unreliable. As a result, schema authors frequently re-define validation schemas for fields that could be completely described with the "format" keyword were its implementation consistent.
This places an undue burden on schema writers who wish to both take advantage of any full implementations and work around any minimal implementations.
Here is an example of a document (written in YAML for human-friendliness) that provides JSON Schemas for ipv4 and ipv6 addresses, for use in other schemas from the same product in place of the "format" keyword:
https://support.riverbed.com/apis/sh.common/1.0/service.yml
The Proposal
JSON Schema can provide a standard "pattern"-based schema for each format value in its meta-schema, which will provide a documented level of purely syntactical validation for instances. This requires only trivial additional work from implementations as shown below under "Mechanism".
Each such schema MUST accept every valid instance of its format. It MAY also accept invalid instances, either because of the limits of regular expressions or because the JSON Schema standard decides that the full pattern is too complex, or has too much of a performance impact, to support at all.
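To make the MUST/MAY distinction concrete, here is a minimal sketch using a deliberately loose ipv4 pattern. The pattern and the name `loose_ipv4_ok` are illustrative assumptions, not text from the proposal.

```python
import re

# Hypothetical fallback pattern (an assumption for illustration): four
# dot-separated groups of one to three ASCII digits. It does not check
# that each group is in the range 0-255.
LOOSE_IPV4 = re.compile(r"[0-9]{1,3}(?:\.[0-9]{1,3}){3}")


def loose_ipv4_ok(value: str) -> bool:
    """Return True if value passes the purely syntactic check."""
    # fullmatch requires the whole string to match, so no anchors are needed.
    return LOOSE_IPV4.fullmatch(value) is not None


if __name__ == "__main__":
    # Every valid address is accepted (the MUST).
    print(loose_ipv4_ok("192.168.0.1"))
    # Some invalid addresses are accepted too (the MAY).
    print(loose_ipv4_ok("999.999.999.999"))
    # Clearly malformed input is still rejected.
    print(loose_ipv4_ok("192.168.0"))
```

A stricter pattern would reject out-of-range octets at the cost of a longer regex; the proposal leaves that trade-off to the standard.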
Mechanism
A "formats" section would be added to the "definitions" within the meta-schema, with its own nested "definitions" holding the fallback schema for each format.
The purpose of the nested "definitions" section is to clearly differentiate between definitions used only for format validation and definitions used to build the actual meta-schema.
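As a rough sketch of the layout described here, the fragment below models the meta-schema as a Python dict. The exact nesting, the ipv4 pattern, and the helper `fallback_for` are assumptions made for illustration; the proposal's own listing is not reproduced.

```python
import json

# Hypothetical fallback pattern; the real proposal may use a different one.
IPV4_FALLBACK_PATTERN = r"^[0-9]{1,3}(\.[0-9]{1,3}){3}$"

# Assumed shape: definitions -> formats -> definitions -> <format name>.
# The inner "definitions" keeps format fallbacks apart from the
# definitions used to build the meta-schema itself.
META_SCHEMA_FRAGMENT = {
    "definitions": {
        "formats": {
            "definitions": {
                "ipv4": {"pattern": IPV4_FALLBACK_PATTERN},
            }
        },
    }
}


def fallback_for(format_name):
    """Look up the fallback schema for a format, or None if there is none."""
    formats = META_SCHEMA_FRAGMENT["definitions"]["formats"]["definitions"]
    return formats.get(format_name)


if __name__ == "__main__":
    print(json.dumps(META_SCHEMA_FRAGMENT, indent=2))
```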
If an implementation does not handle "format": "ipv4" directly, then a schema containing that keyword should be interpreted as the ipv4 fallback schema combined with whatever schema elements beyond "format" were already present.
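One way this rewriting could work is sketched below. The function `apply_format_fallback`, the `FALLBACKS` table, and the choice of "allOf" to do the combining are assumptions for illustration, not the proposal's wording. "allOf" is used so that a "pattern" already present in the schema is not overwritten.

```python
from copy import deepcopy

# Hypothetical fallback table; in the proposal these schemas would live
# in the meta-schema rather than in implementation code.
FALLBACKS = {
    "ipv4": {"pattern": r"^[0-9]{1,3}(\.[0-9]{1,3}){3}$"},
}


def apply_format_fallback(schema, natively_supported=frozenset()):
    """Return a schema in which an unsupported "format" is replaced
    by its pattern-based fallback, combined with the remaining keywords."""
    fmt = schema.get("format")
    # Leave the schema alone if there is no format, if the implementation
    # validates the format itself, or if no fallback is defined.
    if fmt is None or fmt in natively_supported or fmt not in FALLBACKS:
        return deepcopy(schema)
    # Everything except "format" is kept as-is.
    rest = {key: deepcopy(val) for key, val in schema.items() if key != "format"}
    return {"allOf": [deepcopy(FALLBACKS[fmt]), rest]}


if __name__ == "__main__":
    original = {"type": "string", "format": "ipv4", "maxLength": 15}
    print(apply_format_fallback(original))
```

An implementation with its own ipv4 checker would pass `natively_supported={"ipv4"}` and get the schema back unchanged.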
Correctness Concerns
While all of the formats can be at least somewhat validated by regular expressions, several are either extremely complex to fully validate or cannot be entirely validated by a regex. Is this a problem? I argue that it is not, because, properly implemented, this provides substantial validation assistance that schema authors otherwise write themselves each time. Schema authors may examine the supplied regexes, determine whether they are sufficient for the given application, and re-implement them if they are not. This is no worse than what currently happens.
Performance Concerns
Due to the complexity of the regular expressions involved, the performance impact of using them is a valid concern. However, the "format" specification already states that implementations SHOULD provide an option to disable the keyword. That requirement should be left as-is. Disabling the "format" keyword should disable it entirely, including the fallback validation.
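The interaction between the disable option and the fallback could look roughly like the sketch below. The function `check_format`, its parameters, and the pattern table are hypothetical; no existing validator's API is implied.

```python
import re

# Hypothetical fallback patterns, compiled once.
FALLBACK_PATTERNS = {
    "ipv4": re.compile(r"[0-9]{1,3}(?:\.[0-9]{1,3}){3}"),
}


def check_format(value, fmt, *, format_enabled=True, native_checkers=None):
    """Validate value against fmt, preferring a native checker and
    falling back to the pattern. Returns True when nothing applies."""
    # Disabling "format" disables it entirely, fallback included.
    if not format_enabled:
        return True
    # "format" only constrains strings; other types pass.
    if not isinstance(value, str):
        return True
    native = (native_checkers or {}).get(fmt)
    if native is not None:
        return native(value)
    pattern = FALLBACK_PATTERNS.get(fmt)
    if pattern is None:
        return True  # unknown format and no fallback: nothing to check
    return pattern.fullmatch(value) is not None


if __name__ == "__main__":
    print(check_format("not an ip", "ipv4"))
    print(check_format("not an ip", "ipv4", format_enabled=False))
```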