Reducing JSON Schema's Complexity #710
I'm locking this issue to avoid speculative discussion before I've had the chance to form a proper response. |
Thank you @ucarion for your thoughts and previous contributions to JSON Schema. I want to start off by using an illustration. Most of the issues you have raised show that you are seeing things from only one perspective, and not the whole picture. The illustration: https://twitter.com/semestasains/status/1081106334634786816
Almost all people? What are you basing this on? If that were true, we needn't have created and released draft-5 through 7, or put in a ton of work on draft-8.
Very few people...? Again, what is your evidence of this? I have been engaging with the community on a nearly daily basis for the past 5 years, and haven't seen this. Often the questions that come in present the more complex and extensible schemas.
Evidence? Here's a list of implementations that support at least draft-5, and usually draft-7.
I've watched 3 people create a new implementation on and off over the past 6 months or so. It is hard because it's complex, and it's complex because data is complex, and data is complex because that's real life sometimes. Before addressing each of your possible suggestions... What we are hearing is, "[You've made this too complex since draft-4, and haven't listened to what the community needs.]" What are you basing this statement on? Have you looked at how much discussion and listening has gone on to make some really key decisions lately? Have you looked at how we listen to the community, validate feedback, and make changes to the specification documents? Your statement is hurtful to the team given the amount of work they have put into carefully listening to the community, which has sometimes resulted in adding things they are not individually happy with. SO MUCH of the work we do is a direct result of community needs, based on real people asking real questions. Your justification for raising your issue doesn't present any evidence, and makes several wide-sweeping statements about the community and implementations.
Simply, no. There are still issues the community needs resolved, and we cannot address all of them in one draft. In addition, draft-8 is "nearly done"(tm) and adds a whole load of new things (such as vocabularies) which are going to be really important and interesting for the community. It's a much-needed change, evident in what we've seen in the community. @handrews has done an epically fantastic job of putting it together, and it's going to solve many problems across multiple current use cases. We will need people to test and give feedback on this so we can make any required alterations in draft-9. Now, let's look at your suggestions:
Hyper Schema is a separate specification, which is why you find it in a separate document. Anyone creating tools to work with JSON Schema may never read Hyper Schema. Heck, their JSON may never be connected to the internet (and there are several in-production use cases).
Why? Annotations are useful to many. That's like saying "Remove the ability to have comments from [programming language]".
We have done this for draft-8, but provide several different formats, all of which have valid use cases, and not all of which are required in order to be compliant.
The issue here goes back to your perception and limited use case (that illustration link at the top). Core provides many applicator keywords, which can be used by other specifications (like Validation and Hyper Schema) to provide additional functionality. This is where understanding vocabularies is really important. Say an organisation or group wants to form a JSON Schema Form standard, which extends JSON Schema, and is uninterested in Validation. If you had a unified Core and Validation spec, they would have to unpick the bits they required from it for applicators and annotations. Yuck. Wouldn't it be better if they could extend a... core part of how JSON Schema works, in terms of applicability, and could ignore all the Validation keywords? Now of course, you might not care about form generation, but the community does.
We've talked about this a lot recently, but it still makes sense to allow this, and there are many in-production use cases. It CAN be and often IS quite confusing. The easiest way to think about it is: if a non-root schema has an $id, it establishes a new base URI, as if it were the root of its own schema document. Consider a set of schemas which combine together using $refs to create an effectively single schema file. If I want to automagically transclude schemas into a single schema, I need to include the $id of each embedded schema so that references continue to resolve.
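For instance, a minimal sketch of such a transcluded bundle (the URIs and definitions are hypothetical, written in draft-07 style):

{
  "$id": "https://example.com/bundle.json",
  "type": "object",
  "properties": {
    "shippingAddress": { "$ref": "https://example.com/address.json" }
  },
  "definitions": {
    "address": {
      "$id": "https://example.com/address.json",
      "type": "object",
      "properties": {
        "street": { "type": "string" },
        "postalCode": { "type": "string" }
      }
    }
  }
}

Because the embedded schema keeps its $id, the $ref that originally pointed at the standalone address schema still resolves after transclusion.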
For draft-7, you can work around this by wrapping the $ref and its sibling keywords in an allOf. For draft-8, we HAVE actually made this change, because it was a frequent issue, specifically when you wanted to add to the annotations of a generic type you were referencing. See #523 for details. I worked on this myself.
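As an illustration, here is a minimal draft-07 sketch of that workaround (the referenced definition and the extra keywords are hypothetical):

{
  "description": "A user's shipping address, with one extra constraint",
  "allOf": [
    { "$ref": "#/definitions/address" },
    { "required": ["postalCode"] }
  ]
}

Because the $ref sits alone in its own allOf branch, the description annotation and the additional required constraint are not siblings of $ref and so are not ignored.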
Why? I might want to encode an image or other file-type data in a JSON document. I should be able to express what the format is.
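For example, a sketch using the draft-07 content keywords (the media type here is just illustrative) to say "this string is a base64-encoded PNG":

{
  "type": "string",
  "contentEncoding": "base64",
  "contentMediaType": "image/png"
}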
I hope you're starting to see a little bit of the "other side of the picture" here. Finally, I think we should look at the document you linked to regarding your implementation. Specifically, the things you avoid doing for various reasons.
Wrong. JSON Schema does not assume you should be able to access the internet at all, nor that a URI is a network locator, which would mean "accessible on the network / internet".
$ref: https://tools.ietf.org/html/draft-handrews-json-schema-01#section-8.3
Not sure what you mean, as you don't elaborate.
You've conflated two issues here. We've talked about the use of
As mentioned, this is changed for draft-8. For draft-7 there is a workaround. By implementing the specification incorrectly, you can expect to cause problems for other people using your library. Just because you don't know or understand the use case, it doesn't mean there isn't one. Please don't claim you support draft-7 and yet deliberately implement things incorrectly. I've rejected adding implementations to the site's listing for this reason on numerous occasions. People will come to us and/or you with bug reports. Just please do not do this.
Explain what you mean by "poorly". It's optional anyway.
Sounds like something implementation specific to me. eh shrugs
That's up to you, but again, schemas should be portable, and if you can't or choose not to support a keyword, you should throw an error when doing so, and not silently ignore or "fix" things. I'd like to also address some other comments...
I hope I've shown you why that assertion is simply false.
As above.
You're not the first person to say "we need this as an RFC now please". We get that. There are likely multiple GitHub issues relating to RFC status. It's not a priority for us right now. We tried to start it, and were kicked back due to huge misunderstandings. None of us have the energy to start that again. We have a spec with issues that need fixing. I hope you feel my response is balanced and fair. If you have any evidence to provide supporting various statements, that would be great. As I've mentioned, some of the issues you've raised have merit, or have already been worked on, so that's not to say that nothing here is useful! |
Sometimes standards efforts find themselves on the wrong track, and in these cases it's appropriate to submit a long-form response to re-consider the problem being solved and how to solve it. But you need to be very careful: where we are right now is the result of many years of effort trying to create a solution that's well-defined and works for the largest number of cases. @Relequestual has good answers to most of the issues you raise. I've got just a few footnotes: It's perfectly appropriate to be complicated in some cases; the whole point of JSON Schema is so that applications don't have to implement their own validation routines. If authoring documents is complicated, then try to identify a specific problem, research whether it's been brought up before, and file an issue if not. Deliberately varying from the specified behavior is in poor taste; if you think there's a problem, raise an issue, or else specify that it's for research purposes and not for use in production. Finally, half the point of this GitHub organization is to be a forum for implementors so we can converge on behavior. Make sure you're part of the discussion when the discussions happen! (The other half is to get to RFC, but there's not much we'd actually get out of an RFC number besides a registered media type.) |
Hi @Relequestual, I'll address your points one-by-one, and then finish by pressing the issue once more. I'll remind you that my central argument is that:
I'll avoid collapsing this point, because it's important:
I am not here to be hurtful. I have at all times been tactful yet forthright. We are here discussing technical ideas, not insulting one another. Indeed, in an effort to be pithy, I did not weigh down my opening remarks with tomes of evidence. But this reply does contain such evidence, which I hope we can engage with and that you will find apt and satisfactory. To your point, however, asides such as these:
Are wholly unnecessary, and perhaps a bit unprofessional.
In Reply
Now, to address your comments:
On your retort to "Why Simplify", point-by-point
You open by stating that I don't furnish proof for my claims in "Why simplify", and by way of retort, appeal to your experience through Slack and StackOverflow. Consider:
I am indeed questioning the direction we are taking with draft-08. There are many people who don't understand why JSON Schema is still a spec, or why it's going where it is. To name a few places where people are complaining:
I don't think I need to go further here. Between StackOverflow (linked above), and just Googling for people with gripes with JSON Schema, far more people are complaining about complexity than about lacking functionality.
Check out the GitHub issues for the projects listed there. Almost all of them have open tickets about bugs related to $ref.
The real world is complex, but JSON Schema is not helping. Citing an article listed above:
Moving on, you state:
I am not here to say you're not listening to people. More correct would be to say that I think JSON Schema is trying to be all things to all people. Consider Vonnegut:
On that metaphor, I'm asking if we could shut the window. There are lots of problems to solve out there -- how about we solve just one, but really really well?
On your retort to "One alternative approach", point-by-point
Now, to address your comments on my proposed way forward:
Yes. But my suggestion is that it be divorced from this project, so that it does not interfere with JSON Schema validation, which is the crown jewel of this project.
Again -- you can still have annotations. Just like comments, they don't do anything. Most programming languages don't start off natively supporting special behavior in comments. They're just ignored. That's what I'm suggesting we do. I don't see anything in the annotations that requires formalism within the spec. There isn't anything concrete we can formalize about them.
Yes. I'm saying better would be to have one output format that actually is widely-supported, instead of four formats which everyone will do a desultory job of implementing.
I have not forgotten about UI generation from JSON Schemas. I have colleagues who are doing exactly this as part of their job. My contention is that we don't need to formalize keywords like title or description. No need to attempt to formalize how annotation works beyond that. It's acceptable if the annotation use-case is achieved in an ad-hoc fashion, as typically it ends up needing to be closely integrated with things outside of JSON Schema's purview, like external data sources or particular UI technologies. I'll therefore ignore comments suggesting that my proposal would regress or abandon UI generation. I believe it does not.
"Yuck" would be to muddle JSON Schema in order to solve for problems nobody has yet. Let's fix real problems, that people today have. As Oakeshott would say: let's prefer the familiar to the unknown, the sufficient to the superabundant, present laughter to utopian bliss.
It's more than merely confusing. Our intention is to fix this by formalizing what is and is not a sub-schema, a solution at odds with our attempts to make JSON Schema generalizable, because we'll end up locking down all possible "applicator" keyword forms. My suggestion is that we instead cut this infernal Gordian knot.
I believe this is another instance of a problem nobody really has. It is not a terrible burden upon implementations to support taking a list of schema objects, instead of only supporting a single object.
I'm aware -- I was making a point about draft-07, since draft-08 remains a moving target. I would have clarified this, my bad.
This strikes me as another instance of being everything to all people. Do you expect all validators to support all MIME types and content encodings?
On your retort to json-schema-spec-comparison, point-by-point
Indeed, the spec says that. But the test suite, which is where the rubber meets the road, disagrees. It expects that validators somehow know how to assign an identifier to the remote schemas the tests reference; the suite therefore presumes that implementations auto-assign base URIs to the schemas they are given. The test suite therefore requires that validators do precisely what the spec suggests they should not do.
I do elaborate, here. But the reply above explicates this as well.
A "validation" which is "optional" is a poor validation. In that sense, it is defined poorly.
Our comments on a recommended subset of regular expressions are where the real benefits lie in practice. But that's not my point there. I'm saying that the spec should avoid requiring insecure behavior, such as emulating the behavior of a regular expression language that is susceptible to ReDoS.
Prior Art
Finally, to address:
Simplicity is the high bar this project presently fails to meet. This has been stated many times to the authors of this project:
That's @timbray (among other things, one of the co-authors of the original XML spec)
That's Pezoa et al, "Foundations of JSON Schema": http://gdac.uqam.ca/WWW2016-Proceedings/proceedings/p263.pdf
That's @epoberezkin (in #160 -- that entire thread is damning, though.) I hope that the fact that I know of all these examples might serve to alleviate your concerns that I might not be appropriately informed. I've come late, but I've done my homework.
In Conclusion
To conclude: I believe it would be highly ill-advised for the maintainers of this project to continue to ignore concerns that @timbray, a most preeminent IETF editor, and @epoberezkin, the author of the most popular JSON Schema implementation, both seem to have independently arrived at. I therefore press my case again: Are we sure we don't want to pursue simplification? |
I'm not one for the long discussions that we tend to have these days, with lots of back and forth and a huge number of comments to follow, but just on one little tiny piece here (and I will probably then unsubscribe to be honest, because I don't really find this issue helpful):
I think as the maintainer of the test suite I can say I agree with neither your premise nor your conclusion there :) -- the test suite is not where the rubber meets the road, it simply represents what my brain translated the spec into as an executable format, and if it has bugs, we fix them. If you think the spec says something different from the suite, that's a bug; please file a PR, say what's different from the spec, and it will be merged. Not sure what your point here is though -- the test suite makes no such assumptions, so if you think it does, please elaborate on exactly what part of the suite does that. |
Here's the central point I'd like to get at:
I don't necessarily disagree, but you're going to have to come up with something specific and actionable. Saying "JSON Schema is too complex" is not, by itself, actionable. I'm familiar with most of the arguments you presented. For example, iirc, Tim Bray was talking about draft-4, which I spent a great amount of time addressing with draft-5. (Also, XML isn't really a bastion of simplicity, either. see: billion laughs attack; see: downloadable DTDs; see: escaping CDATA sections inside CDATA sections; see: literally any time you want whitespace to be significant) First, which specific problems are there for schema authors? Second, which changes can fix these problems? Finally, for each of the problems, consider finding the relevant issues in the tracker, or filing new ones. I see a handful of specifics, but it's awkward to talk about all of them in a single issue. Pick one that's important to you and let's work through it in a new issue, or an email/Slack thread. |
On mobile so this one will be brief. @ucarion, I’m noticing a few trends in your assertions which are making this conversation tricky. You make claims which have no evidence, and which are in fact contrary to the reality we see daily in the community and have for years; then you dispute the fact that we see these things come up on a regular basis, as though we are purely making an appeal to authority and not offering the anecdotal evidence it was offered as. You: “Nobody wears hats.” Another complication is you are defining everything in terms of simplicity. Simplicity is a vague term, and you seem to be defining it as “features I want and need for my use cases”, so anything outside of what you want and need is seen as cruft that should be removed, ignored, or used later. This is of course not the same definition of simplicity the contributor team should use for this project, or it wouldn’t be widely applicable to many people. Bugs existing in older implementations is not evidence of failings in current JSON Schema, especially seeing as the older tools are not being actively maintained (that’s why they’re stuck on draft 04). Newer versions of JSON Schema (7 and 8) have done a great job of making the language more clear and concise, and understandable to the layperson. Anyway, onto your examples in prior art. These were the best part of your post. @timbray had concerns about $ref and error output. As Ben has already said, $ref works the way you want it to work in draft 8, and the language has been simplified since 4/5. Error outputs are also a thing, which Ben also already said. Tim should be content with more recent changes, and should be happy to know that BC breaks are stabilising in later versions, as most of the work has been clarification and addition. The discrepancies between drafts should be moving closer together. Re: The Foundations of JSON Schema: yep, you have had answers explaining that an RFC number would be nice to have, but also that having our very own v1.0 would be equally nice to have, and we’re working on that. Their concerns will be solved when there are no longer drafts. Drafts are required to flesh out ideas, otherwise you’re just flopping stuff out on the public and that’s no good for anyone. Then yes, there’s #160 and a lot of related issues. If you think that fella has been ignored then you’ve not spent much time on the issues here. We’ve dedicated months to trying to resolve discussions with that person. So, your concern is that other famous people have concerns, and those concerns are:
A call to action: can you define simplification in a more useful fashion? Suggesting the contributors do not want to pursue simplification is to assert we’re trying to make this unnecessarily complex, and that couldn’t be further from the truth. Maybe you could help identify some wording in draft 8 that could be simplified, and improve it in the form of a pull request? All without ripping out keywords that people actively use, because that would cause some fairly major discrepancies in implementations, and that’s something we strive to avoid unless absolutely necessary. 👍🏼 |
@ucarion you've gone to some effort to isolate and frame negative comments on JSON Schema, but in terms of validating your view that post-draft-04 work is too complex, it doesn't hold up. As @awwright noted, that post by Tim Bray is years old and specifically referring to draft-04. @awwright did a fantastic job of making things a lot more straightforward to read and understand, and we have continued to build on that as we've clarified and added examples based on feedback. Indeed, Tim was one of the more encouraging voices in our otherwise dismal discussion with the JSON mailing list. (Other people who have published RFCs have also been encouraging -- that thread is not our only discussion of the topic). Note that one reason for creating an output format in draft-08 is Tim's comments in that discussion:
This was useful feedback which we have acted upon. Because when someone who knows how to write an RFC gives us specific, constructive feedback (even though it was by his own admission outdated), we pay attention to that. As for your comments regarding stackoverflow, popularity is in part a function of time. And duplicate questions get closed. The most popular questions are about draft-04 because that is what has been around long enough to get popular. @Relequestual's experience monitoring stackoverflow reflects what people ask about now, and what implementations they are using regardless of whether their question is specific to a newer draft or not. Your dismissal of that experience in favor of a metric that has more to do with time than features is not convincing. Regarding the Hacker News link, you extracted a single comment from a much larger thread that was started by someone praising JSON Schema. While the full thread has some people debating the roles of JSON Schema vs TypeScript and the like, it also includes a very long sub-thread of people saying that yes, they use JSON Schema, and what they use it for. While I'm sure some of them use draft-04, there is a notable absence of complaints over there being newer drafts. The one post that you specifically referenced is only confused over Hyper-Schema. As has been stated many times in many places, Hyper-Schema is a separate specification and is not impacting or blocking the Core or Validation specifications in any way. People are welcome to find Hyper-Schema irrelevant. But what people think of Hyper-Schema has nothing to do with Core or Validation. The last draft of Hyper-Schema was not even published at the same time as the last drafts of Core and Validation. You later say:
It does not interfere in any way with JSON Schema validation. How many times do we have to say this to you?
You clearly have not looked at our history and all of the things we have turned down, or shunted to other projects. This is hilariously off-base.
You have clearly not tried to implement something that relies on these (and similar) keywords heavily. Other people have. Your dismissal of their use cases is unconvincing.
@gregsdennis (who wrote and maintains an implementation) put in a heroic amount of work to gather feedback and incorporate input from a wide variety of people, primarily other implementors, to produce those formats. You, on the other hand, are just asserting that it's all wrong. We will use the extensive work done to produce that proposal and get feedback on it. If we get feedback that not all four are used, we will act on that feedback. This is what drafts are for.
This is an incredibly common problem. There are tools out there that just do this, and nothing else. The most popular JavaScript one gets nearly 400,000 downloads weekly from npm. Why do you think you can just assert all of these other use cases away? Your arguments are completely counter to measurable reality.
Formal extension is a problem that many people have. That is why we have spent so much time and effort on it. In particular, UI generation, code generation, and API documentation generation are not well-served by current use cases. And since we actually do refuse to make JSON Schema everything to everyone, we have drawn hard lines against adding features to support those use cases. However, there is a great deal of demand for using JSON Schema as a base for these sorts of things. There are numerous very popular web form libraries (Mozilla maintains one, and there are others for both Angular and React). API documentation and code generation are major use cases for OpenAPI, which I hope you are aware is very popular. I work directly with the OpenAPI Technical Steering Committee on converging their use of JSON Schema with future drafts. The
I'm not sure who "our" is supposed to refer to? I also have no idea what you mean in general. There is nothing that locks down "applicator" keyword forms. There's no solution to this yet published, so I have no idea how you can know that it will cause such a problem. Are you saying you want to make JSON Schema generalizable? But you also want to throw out all of the work done to establish patterns of keyword behavior which is explicitly done to make it generalizable?
No, and the spec notes that such a thing would be prohibitively difficult. Doing any sort of validation with This has the same problem as optional If you and Evgeny want to go have a pure validation with no As for the rest, your assertions of what is and is not a real-world use case reveal a profound ignorance of what people actually ask for. None of it is convincing. |
Hi, I notice my name being bandied about quite a bit here. I would like to be able to use JSON Schema, but have disliked earlier drafts. Is now a good time to take a look at the current draft? BTW I'm super glad to hear that you're working with the OpenAPI people. Having them keep up with your progress is A Good Thing. |
My name is bandied here too - good club it seems. The language could have been a bit nicer, but I guess all deserved. On the core subject of $ids - I can confirm that very few JSON Schema users outside of some small silos of really advanced users use IDs inside schemas (i.e. not in the root) and/or understand how base URI change works and why it is needed.
I’d really like to see which validators would pass Ajv tests for all $ref scenarios - no JS validator was passing them (it’s not to say that Ajv is any better because of that - it’s just to support the point that the current $ref spec is very complex to fully implement, and I gave up on fixing some of the rare edge cases): https://github.com/epoberezkin/test-validators
They are indeed optional, but the problem is that ALL users expect them to work in a certain way, often in a different way from the spec (particularly when it comes to uri and email). Supporting formats is a constant source of learning for me (for example, I would not have known that 23:59:60 is a valid time otherwise). But I don’t think removing them is a viable option at this point, even though they cause more contention than all other keywords. As an idea, maybe it is possible to include very formal and simplified/permissive definitions in the spec that can be expressed as simple regular expressions (instead of relying on complex definitions in specific RFCs that most validators implement as convoluted regular expressions anyway, but inconsistently) and let end users either redefine them if they need more restrictive validation or to capture the errors outside of JSON schema. |
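To sketch that idea (the pattern below is purely illustrative, not a proposed normative definition): a deliberately permissive, structural-only definition of the email format could be spelled out in the spec as a simple regular expression, roughly equivalent to:

{
  "type": "string",
  "pattern": "^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$"
}

Users who need stricter validation could then redefine the format or handle it outside JSON Schema, as suggested above.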
@timbray thanks for commenting! We are currently wrapping up the latest draft, which focuses on establishing a consistent processing model and classifications of schema keyword behaviors, so that JSON Schema as a system can be extensible. I need to do a read-through of the whole thing again (now that all of the major changes are in), and then we'll probably do some re-working of the sections for better logical flow and readability. For example, the "Overview" section has gotten far too long to qualify as an overview! After that (hopefully in the next week-ish now that I've finally had time to focus on this again), we will put the result up for final review for a couple of weeks, and then publish it as the next I-D. If you're interested in taking a look during the pre-publication review period we would love to get your feedback, or you can wait until it's published and comment then. The final pre-publication review is mostly for ensuring readability- unless someone spots an egregious problem, any substantive changes will be deferred to the next draft (this one is already months late due to personal life interfering). I consider this upcoming draft of the Core and Validation specifications to be nearly feature-complete. There are some glaring unresolved issues around extensibility, but we decided that the best way to address those was to get feedback on the parts we have worked out. There's a lot to it already and while we've had good participation here there is no substitute for people trying it out in Real Life (tm). The other major unresolved thing is providing some predictability around what are now very unpredictable "optional" validation behaviors ( This draft that we are about to publish addressed the really big questions that caused the project to stall several years ago, so we are hoping that we are over the hump and now just tying up known loose ends. As @ucarion notes, all of this has introduced complexity, but we believe that a.) if you still want to implement a plain validator, you can do that, and b.) the most complex aspects only impact people designing extension vocabularies, or writing a full-featured extensible implementation. And there are major use cases, such as OpenAPI, who are interested in that extensibility. |
@timbray regarding the usage pattern (switching schemas on a type field), the idiom that is the most straightforward (although verbose, and see below for discussion of declarativeness) is something like:
{
"type": "object",
"oneOf": [
{
"if": {"properties": {"schemaType": {"const": "foo"}}},
"then": {"$ref": "#/$defs/foo"},
"else": false
},
{
"if": {"properties": {"schemaType": {"const": "bar"}}},
"then": {"$ref": "#/$defs/bar"},
"else": false
},
{
"if": {"properties": {"schemaType": {"const": "biz"}}},
"then": {"$ref": "#/$defs/biz"},
"else": false
}
],
"$defs": {
"foo": {...},
"bar": {...},
"baz": {...}
}
}
There was much debate over whether the if/then/else keywords were worth adding at all, since the same thing can be expressed with the existing boolean applicators. The above example can be written as:
{
"type": "object",
"oneOf": [
{
"allOf": [
{"properties": {"schemaType": {"const": "foo"}}},
{"$ref": "#/$defs/foo"}
]
},
{
"allOf": [
{"properties": {"schemaType": {"const": "bar"}}},
{"$ref": "#/$defs/bar"}
]
},
{
"allOf": [
{"properties": {"schemaType": {"const": "baz"}}},
{"$ref": "#/$defs/baz"}
]
}
],
"$defs": {
"foo": {...},
"bar": {...},
"baz": {...}
}
}
Which idiom you prefer is probably a matter of stylistic preference plus the error reporting behavior of your implementation. As noted earlier in this thread, the forthcoming draft also proposes standardized error reporting behavior, which we hope will improve the quality and consistency of error reporting in implementations. |
@epoberezkin thanks for commenting! Good to hear from you.
It would have been more accurate for me to say "it is possible for pure validators to be in full conformance, despite non-validator features having been added".
I filed #54 for this idea back in 2016 :-D
This is exactly what we mean when we talk about just making I am pretty sure that one or the other or both of these ideas will be a feature of the draft after this one. We are also looking at improving the extensible vocabulary support enough to let meta-schema authors control optional behavior, e.g. "if you can't guarantee full validation of this, then refuse to process this schema at all". The worst part of |
Yeah I definitely agree that in terms of humans writing schemas, that feature is very, very rarely used. I do want to point out that the "advanced users" case includes automated tooling, which I would argue is what the feature is really for. I have used that feature to package schemas into a single easily distributable file, even though the schemas are developed in many much smaller files which are easier to work with by hand and in version control. One thing we might want to do is make those use cases more clear. We have put more information into the spec about when and why you would use these things, but perhaps if it was explicitly clear that base URI changing is primarily for programmatic tools, people who are just writing schemas by hand would be less confused over it. And people writing tools and implementations would understand why it's there a bit better. This is something we can look for during the final wording review for this draft. |
@jdesrosiers had some excellent ideas on |
Apologies for the tone I assume was implied by this aside. Apologies that I did not give it full consideration. I was attempting to demonstrate that we did in fact take a lot of care to listen to community needs. You're correct, we are here to discuss technical issues. You're right to call that out. Whatever our disagreements, we want to be welcoming of open discussion. To mention one specific comment...
The issue I present here is actually EXACTLY what OpenAPI (formerly Swagger) did: creating a sub/super set of JSON Schema, because they wanted to exclude from, redefine, and add to JSON Schema. It created no end of problems with OpenAPI implementations using JSON Schema implementations "as is". I'm hoping that a lot of the discussion has been helpful here. Assuming you do create new issues for each point, copy relevant discussion over, and create a comment here which links to those issues. Once you've done this a few times, I'll go ahead and close this issue. I'll leave it unlocked, on the basis that discussion on specific issues will happen in OTHER new or existing GitHub issues. No one has the headspace for multi-issue mega threads. Realistically, because of the phase of draft-8, we aren't going to make any grand changes now. I feel that, and as an attempt to summarise others' comments also, the evidence you present doesn't stack up, or is outdated, relating to previous drafts of JSON Schema. |
@timbray Great to see you here. I'm hoping @handrews's example was helpful. |
I see you got the expected reaction. "You don't understand! There's a Good Reason for all this complexity!" |
@bobfoster do you have any more convincing arguments, or are you just here to tell us that we don't know what we're doing? |
I think this issue can be safely closed out. But for posterity, I think it could be good to agree on why we're closing it. @handrews, @awwright: would you be ok with closing this ticket? I'll add my summary to the top of the ticket. On my reading, in summary:
I have no doubt this approach will work, but it will take time -- when you do more, there's more to get right. There perhaps exists room for a far more modest variant of JSON Schema, more aligned with the aims I've proposed in this ticket. Sorry this ticket had to take the form of a sort of FOSS ninety-five theses! But my goal here was to change the zeitgeist, not shave off things at the peripheries. That's why this wasn't a specific issue about a particular feature, but instead a proposal about how we think about all features. |
@ucarion There are several variants of XML schemas, I don’t see why there should be only one JSON schema. In my experience, an absolute majority (feels like 99%) of current users need a small subset of features well aligned with what was suggested in this ticket, with the exception of formats (on which I commented too - I believe formats in schema should not mean any other RFC; instead they should mean a shorthand for some agreed simple regular expression that is much more permissive and simply does structural rather than semantic validation of strings, in a way similar to how JSON Schema does structural rather than semantic validation of data structures). The only viable reason I see for the base URI change and JSON pointers to exist is the desire to bundle multiple schemas into a single file. Unfortunately many users believe that it is possible to bundle multiple schemas into a single schema by substituting refs - it is not possible to do in a static way in the general case. A simple bundle being an array of schemas is possible though, and is already supported by several validators - defining it in the spec and making its support mandatory would eliminate the need for any other bundling, base URI changes and JSON pointers. I agree that a radically simplified spec is long due, whether it happens in the current group or outside. While there is a growing number of JSON Schema users, there is a much bigger number of developers who do not use it. The current level of complexity is a serious blocker for JSON Schema adoption. |
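A rough sketch of the array-style bundle being described above (hypothetical URIs; each element is an ordinary root schema with its $id at the root, so no base URI changes occur inside any individual schema):

[
  {
    "$id": "https://example.com/address.json",
    "type": "object",
    "properties": { "street": { "type": "string" } }
  },
  {
    "$id": "https://example.com/user.json",
    "type": "object",
    "properties": { "address": { "$ref": "https://example.com/address.json" } }
  }
]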
Closing per @ucarion's last comment. |
Note: Opinions expressed herein are entirely my own and not the views of my employer.
Whereas previous drafts of JSON Schema have focused on extending, bugfixing, or generalizing JSON Schema, I would like to propose that the focus of the next iteration of JSON Schema be on reducing complexity.
Why simplify
On the current track, it would take a nontrivial amount of time for JSON Schema to reach the high bar of formality and clarity that the IETF RFC process requires. But the industry needs JSON Schema now. This is a testament to the importance of what this project is working on today.
By focusing on simplifying JSON Schema, and focusing on those problems we know we can solve for users, we will be able to make something people really need. Consider the following:
Almost all people who will use JSON Schema have had everything they need since draft-04. For most people, we could have stopped there.
Very few people need a sophisticated, extensible, hypermedia-driven validation framework that's IETF-standardized. Lots of people need a standardized, reliable schema language that works on all of their different platforms and systems identically.
Most implementations of JSON Schema are out of date and buggy. For example, almost none of them support $ref 100% correctly. format is super unreliable. A ton of implementations are stuck on draft-04.
It's very hard to create a new implementation of JSON Schema. The spec, when read from A-to-Z, is confusing -- and takes very long to read, since the spec is now a multi-thousand-line formalization spanning three documents.
Time is not on our side here. JSON Schema is nine years old. With each passing draft, we are creating a new generation of divergent, out-of-date implementations. As time passes, those implementations will ossify and require a new generation of deprecations and re-writes.
Many contributors to this project note, aptly, that this project is a volunteer effort, and that it's impossible to punctually achieve our ambitious aims entirely on our spare time. The solution is not to take another few years to get this project done. The solution is to focus on what's already out there, formalize that, and wrap this thing up.
One alternative approach
I have implemented a simplified approach to JSON Schema through the form of a test suite, and two implementations which pass it:
For a detailed overview of what the differences in this approach are, see:
The above document focuses on differences in test suites. But JSON Schema has many details which it does not concretize in tests. On the approach I've implemented, we could take the following actions to make the spec simpler:
Remove $id outside of root documents.
No longer have $ref disable its sibling keywords.
Remove format, contentMediaType, and contentEncoding.
Doing so would leave us with something that's backwards-compatible with what most people are using JSON Schema for today. The biggest pain-point will be for people who use $id inside sub-schemas -- they will have to spread their schemas across multiple documents.
Nobody can ever be forced to change. But on my proposal, those who elect to will likely not find that much of anything has changed. And those upgraders will be joined by a new generation of enterprise users, who cannot use JSON Schema today for lack of formalization and off-the-shelf implementations.
Conclusion
This ticket is not an open-ended diatribe. This ticket asks the following question: shall we change the overarching objective of this project to be cutting scope and simplifying? Shall we make our prime directive be to have, by the next draft, something that can be accepted as an IETF RFC?
Afterword
In summary, the answer to the above question is "no", to the extent that anything based on rough consensus can ever be decided. In more detail:
The intention of this issue was to discuss whether JSON Schema should make IETF standardization its prime directive, and focus on simplification as the instrumental means of achieving that end.
JSON Schema remains ultimately a project run on the basis of rough consensus. And there do not today exist many people on this project with enthusiasm for wrestling with standards bodies.
Nor is it evident that JSON Schema can or ought to dramatically cut scope. Though there are many people who could live with just a small subset of JSON Schema that the project has long supported, there are also many people who want everything that's in the spec present, imminent, and future.
Therefore, JSON Schema shall not change its focus. The current trajectory -- of making a sophisticated, generalizable, extensible system for validating and annotating JSON-like data -- shall remain the course.