
Reducing JSON Schema's Complexity #710

Closed
@ucarion

Note: Opinions expressed herein are entirely my own and not the views of my employer.

Whereas previous drafts of JSON Schema have focused on extending, bugfixing, or generalizing JSON Schema, I would like to propose that the focus of the next iteration of JSON Schema be on reducing complexity.

Why simplify

On the current track, it would take a nontrivial amount of time for JSON Schema to reach the high bar of formality and clarity that the IETF RFC process requires. But the industry needs JSON Schema now. This is a testament to the importance of what this project is working on today.

By focusing on simplifying JSON Schema, and focusing on those problems we know we can solve for users, we will be able to make something people really need. Consider the following:

  1. Almost all people who will use JSON Schema have had everything they need since draft-04. For most people, we could have stopped there.

  2. Very few people need a sophisticated, extensible, hypermedia-driven validation framework that's IETF-standardized. Lots of people need a standardized, reliable schema language that works on all of their different platforms and systems identically.

  3. Most implementations of JSON Schema are out of date and buggy. For example, almost none of them support $ref 100% correctly. format is super unreliable. A ton of implementations are stuck on draft-04.

  4. It's very hard to create a new implementation of JSON Schema. The spec, when read from A-to-Z, is confusing -- and takes a long time to read, since the spec is now a multi-thousand-line formalization spanning three documents.

Time is not on our side here. JSON Schema is nine years old. With each passing draft, we are creating a new generation of divergent, out-of-date implementations. As time passes, those implementations will ossify and require a new generation of deprecations and re-writes.

Many contributors to this project note, aptly, that this project is a volunteer effort, and that it's impossible to punctually achieve our ambitious aims entirely in our spare time. The solution is not to take another few years to get this project done. The solution is to focus on what's already out there, formalize that, and wrap this thing up.

One alternative approach

Note: this suggested approach is merely illustrative. It is not formally part of what I'm proposing in this issue, but does prove a point.

I have implemented a simplified approach to JSON Schema in the form of a test suite, and two implementations which pass it:

For a detailed overview of what the differences in this approach are, see:

The above document focuses on differences in test suites. But JSON Schema has many details which it does not concretize in tests. Under the approach I've implemented, we could take the following actions to make the spec simpler:

  1. Remove Hyper-Schema entirely.
  2. Remove annotations from the spec entirely. You can still have annotations; they're just not a standardized thing.
  3. Have a single suggested output format for errors.
  4. Unify the "core" and "validator" documents.
  5. Remove $id outside of root documents.
  6. Stop having $ref disable its sibling keywords.
  7. Remove format, contentMediaType, and contentEncoding.
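
To make item 6 concrete, here is a minimal sketch of the two behaviors. It is not taken from the test suite or the implementations mentioned above. In draft-07 and earlier, other properties in a $ref object are ignored; the change would have them apply alongside the referenced schema. The names `resolve`, `is_valid`, and `ref_disables_siblings`, and the example schema, are hypothetical, and only `type: "string"`, `maxLength`, and same-document references without escaped characters are handled.

```python
from typing import Any


def resolve(root: dict, ref: str) -> Any:
    """Follow a same-document reference such as '#/definitions/name'.

    Assumption: no '~0' / '~1' escapes and no references to other documents.
    """
    node: Any = root
    for token in ref.lstrip("#").split("/"):
        if token:
            node = node[token]
    return node


def is_valid(root: dict, schema: dict, instance: Any,
             ref_disables_siblings: bool) -> bool:
    """Toy validator that understands only $ref, type: "string", and maxLength."""
    if "$ref" in schema:
        target = resolve(root, schema["$ref"])
        if not is_valid(root, target, instance, ref_disables_siblings):
            return False
        if ref_disables_siblings:
            # Draft-07 style: keywords next to $ref are never looked at.
            return True
    # Simplified style: sibling keywords are checked like any others.
    if schema.get("type") == "string" and not isinstance(instance, str):
        return False
    if "maxLength" in schema and isinstance(instance, str):
        if len(instance) > schema["maxLength"]:
            return False
    return True


# Hypothetical schema: a $ref with a sibling "maxLength" keyword.
SCHEMA = {
    "definitions": {"name": {"type": "string"}},
    "$ref": "#/definitions/name",
    "maxLength": 3,
}

# The same instance gets a different verdict depending on the rule in force.
old_style = is_valid(SCHEMA, SCHEMA, "abcdef", ref_disables_siblings=True)
new_style = is_valid(SCHEMA, SCHEMA, "abcdef", ref_disables_siblings=False)
```

With the sibling-disabling rule, `maxLength` is silently skipped and the six-character string is accepted. Without it, the same string is rejected, which is probably what the schema author intended.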

Doing so would leave us with something that's backwards-compatible with what most people are using JSON Schema for today. The biggest pain-point will be for people who use $id inside sub-schemas -- they will have to spread their schemas across multiple documents.

This is just an illustration of the idea, which I've complemented with working code, because we reject kings and presidents. The point here is that simplification can be achieved, and it can be done in a way that doesn't unduly harm our core constituency.

Nobody can ever be forced to change. But under my proposal, those who elect to will likely find that not much of anything has changed. And those upgraders will be joined by a new generation of enterprise users, who cannot use JSON Schema today for lack of formalization and off-the-shelf implementations.

Conclusion

This ticket is not an open-ended diatribe. This ticket asks the following question: shall we change the overarching objective of this project to be cutting scope and simplifying? Shall we make it our prime directive to have, by the next draft, something that can be accepted as an IETF RFC?


Afterword

In summary, the answer to the above question is "no", to the extent that anything based on rough consensus can ever be decided. In more detail:

  1. The intention of this issue was to discuss whether JSON Schema should make IETF standardization its prime directive, and focus on simplification as the instrumental means of achieving that end.

  2. JSON Schema remains ultimately a project run on the basis of rough consensus. And there are not many people on this project today with enthusiasm for wrestling with standards bodies.

  3. Nor is it evident that JSON Schema can or ought to dramatically cut scope. Though there are many people who could live with just a small subset of JSON Schema that the project has long supported, there are also many people who want everything that's in the spec, present, imminent, and future.

  4. Therefore, JSON Schema shall not change its focus. The current trajectory -- of making a sophisticated, generalizable, extensible system for validating and annotating JSON-like data -- shall remain the course.
