Bug: JSON Schema-to-GBNF additionalProperties bugs (and other minor quirks) #7789
@HanClinto thanks for reporting these!
Agree! I've been wanting to add a JSON schemas section to grammars/README.md. I'll take a crack at it and will probably also update the stale grammar examples.
* Adding simple bare-bones test for end-to-end integration test for json validation against auto-generated JSON-schema grammars.
* Adding additional examples as documented in #7789. Also adding the ability to automatically output improperly failing grammars to debug output files so they can more easily be examined in the gbnf-validator program.
* Uncommenting formerly commented tests so that they fail for others who are attempting to reproduce the bugs.
* Merging improved schema test methods added by @ochafik in #7797
* Adding #define to temporarily remove failing tests so that this PR can pass CI, but still be useful for other PRs that want to leverage the framework.
* Fixing nits from ochafik. Removing escape slashes, adding additional failing cases, fixing some other strings.
* Fixing grammar indentation to be consistent throughout file.
What happened?
While debugging json-schema-to-gbnf grammars, I noticed a few bugs / quirks and wanted to write them down somewhere.
additionalProperties seems to default to false (not matching spec).
By default, additional properties should be permitted. However, providing a schema like:
Then it correctly passes on these strings:
But then it improperly fails on the string:
This is clearly given in the json-schema docs as an example of a string that should match this schema, so we're doing something wrong.
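For reference, here is a minimal Python sketch of what the spec-defined default looks like. It is not llama.cpp code. The schema and sample string are assumptions: they mirror the address example in the json-schema docs, which seems to be what this section describes. `value_matches` and `object_matches` are hypothetical helpers covering only the few keywords needed here.

```python
import json

# ASSUMPTION: address example from the json-schema docs, standing in for
# the schema discussed above.
ADDRESS_SCHEMA = {
    "type": "object",
    "properties": {
        "number": {"type": "number"},
        "street_name": {"type": "string"},
        "street_type": {"enum": ["Street", "Avenue", "Boulevard"]},
    },
}


def value_matches(subschema, value):
    """Check a value against the tiny keyword subset used in this sketch."""
    if subschema is True:   # boolean schema `true` accepts anything
        return True
    if subschema is False:  # boolean schema `false` accepts nothing
        return False
    if "enum" in subschema and value not in subschema["enum"]:
        return False
    expected = subschema.get("type")
    if expected == "number":
        # bool is a subclass of int in Python, but JSON booleans are not numbers
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if expected == "string":
        return isinstance(value, str)
    return True


def object_matches(schema, instance):
    """Validate an object, treating a missing additionalProperties as true."""
    if not isinstance(instance, dict):
        return False
    properties = schema.get("properties", {})
    additional = schema.get("additionalProperties", True)  # spec default
    for key, value in instance.items():
        # Declared properties use their own subschema; everything else
        # falls through to additionalProperties.
        if not value_matches(properties.get(key, additional), value):
            return False
    return True


with_extra = json.loads(
    '{"number": 1600, "street_name": "Pennsylvania",'
    ' "street_type": "Avenue", "direction": "NW"}'
)
print(object_matches(ADDRESS_SCHEMA, with_extra))  # True
strict = dict(ADDRESS_SCHEMA, additionalProperties=False)
print(object_matches(strict, with_extra))  # False
```

Under these assumptions, the extra "direction" property is rejected only once additionalProperties is explicitly set to false, which is the behavior the generated grammar would need to reproduce.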
Explicit "additionalProperties"=true behavior is even worse.
If we change the above schema to:
Then things really start to go awry. These strings should all pass (indeed, they passed before, when we didn't explicitly set anything for additionalProperties), but instead they are failing now:
And our sample with an additional property still doesn't match:
The only string that matches out of the original is the empty object ({}).
Looking at the generated GBNF, there is some weird stuff going on. Here is the GBNF with additionalProperties set implicitly:
And here is the GBNF from additionalProperties set explicitly to true:
The key differences to note here are how street-type-rest is now being defined (even though it was never defined in the original), and additional-kvs seems to be getting appended to each property without a comma in between (nor an optional flag).
I haven't yet wrapped my brain around what all is going on with that, but I wanted to lay out how far I'd gotten on my own.
Unlike strings, enums don't support spaces between properties and values.
This is definitely in the "quirk" more than "bug" category, but when using a schema like:
Then validating against it means that:
is a valid string, but adding spaces around the enum value causes either of the following to fail:
Interestingly, adding spaces around a string value works fine, and these match the generated grammar just fine:
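To make the quirk concrete, here is a rough Python sketch that uses regexes as stand-ins for two generated rules. The property name "color", its values, and the positions where whitespace is tolerated are all hypothetical; the actual GBNF is not reproduced here. It only illustrates how a rule without optional whitespace rejects input that is still equivalent JSON.

```python
import json
import re

# Rough analogue of an optional-whitespace rule (assumption, not the real GBNF).
SPACE = r"[ \t\n]*"

# Hypothetical "tight" rule: the enum literal must follow the colon directly.
TIGHT_ENUM = re.compile(r'\{"color":"(red|green)"\}')

# Hypothetical "loose" rule: whitespace is tolerated around keys and values.
LOOSE_ENUM = re.compile(
    r"\{" + SPACE + r'"color"' + SPACE + ":" + SPACE
    + r'"(red|green)"' + SPACE + r"\}"
)

compact = '{"color":"red"}'
spaced = '{"color": "red"}'

# Both strings decode to the same JSON value...
print(json.loads(compact) == json.loads(spaced))  # True
# ...but only the loose rule accepts the spaced form.
print(TIGHT_ENUM.fullmatch(spaced) is not None)   # False
print(LOOSE_ENUM.fullmatch(spaced) is not None)   # True
```

Since JSON treats whitespace between tokens as insignificant, making the enum rule as permissive as the string rule would keep the two consistent.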
Unsupported Attributes
We should probably build a list of unsupported attributes and note them in the documentation -- some that I've noticed thus far:
* exclusiveMinimum (probably can't be handled for anything except for special cases of 0, requiring either the presence of a - or not)
* uniqueItems -- not sure how we could support this without a regex engine that supports capture groups and lookbehinds and whatnot.
Name and Version
version: 3093 (7672ade)
built with Apple clang version 15.0.0 (clang-1500.3.9.4) for arm64-apple-darwin23.4.0
What operating system are you seeing the problem on?
Mac
Relevant log output
No response