-
Notifications
You must be signed in to change notification settings - Fork 10.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
json
: support integer minimum, maximum, exclusiveMinimum, exclusiveMaximum
#7797
Conversation
{"type": "integer", "minimum": 3}
root ::= ([0-2] [0-9]{1,15} | [3-9] [0-9]{0,15}) space This rule matches strings like |
@p-e-w Oops good catch, thanks! That was one of the "heading zeros allowed" I had on my radar, hadn't realized it was also letting through bad numbers. Fixed. |
* Adding simple bare-bones test for end-to-end integration test for json validation against auto-generated JSON-schema grammars. * Adding additional examples as documented in #7789 . Also adding the ability to automatically output improperly failing grammars to debug output files so they can more easily be examined in the gbnf-validator program. * Uncommenting formerly commented tests so that they fail for others who are attempting to reproduce the bugs. * Merging improved schema test methods added by @ochafik in #7797 * Adding #define to temporarily remove failing tests so that this PR can pass CI, but still be useful for other PRs that want to leverage the framework. * Fixing nits from ochafik. Removing escape slashes, adding additional failing cases, fixing some other strings. * Fixing grammar indentation to be consistent throughout file.
json
: support integer minimum, maximum, exclusiveMinimum, exclusiveMaximumjson
: support integer minimum, maximum, exclusiveMinimum, exclusiveMaximum
tests/test-grammar-integration.cpp
Outdated
); | ||
|
||
test_schema( | ||
"min 0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This schema looks identical with the "simple min 0" test from line 154. Can they be merged together to shorten this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👌
tests/test-grammar-integration.cpp
Outdated
@@ -800,6 +948,165 @@ static void test_json_schema() { | |||
} | |||
); | |||
|
|||
test_schema( | |||
"min -123 max 42", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This schema looks identical to the one from the "simple min -123 max 42" test on line 249. Can they be merged together?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, reshuffled / merged them all, thanks!
common/json-schema-to-grammar.cpp
Outdated
@@ -16,7 +16,7 @@ static std::string join(Iterator begin, Iterator end, const std::string & separa | |||
|
|||
static std::string repeat(const std::string & str, size_t n); | |||
|
|||
static std::string build_repetition(const std::string & item_rule, int min_items, int max_items, const std::string & separator_rule = "") { | |||
static std::string build_repetition(const std::string & item_rule, const int min_items, int max_items, const std::string & separator_rule = "") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're changing min_items
to const
, is it worth adding const
to max_items
as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dropped unintended change, thanks!
tests/test-grammar-integration.cpp
Outdated
@@ -12,6 +12,9 @@ | |||
#include <cassert> | |||
#include <string> | |||
#include <vector> | |||
#include <json-schema-to-grammar.h> | |||
|
|||
using json = nlohmann::ordered_json; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be a fault of the way I merged the code, but we don't need these 3 lines because json-schema-to-grammar.h
is already included, and this using
line is already here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops sorry bad merge, fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
…aviour of libstdc++ vs. clang's libc++, can't read final NULL char w/ former)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice defensive programming!
Very excited to see these changes getting merged in!! :) 🚀 🚀 🚀 |
* Adding simple bare-bones test for end-to-end integration test for json validation against auto-generated JSON-schema grammars. * Adding additional examples as documented in ggerganov#7789 . Also adding the ability to automatically output improperly failing grammars to debug output files so they can more easily be examined in the gbnf-validator program. * Uncommenting formerly commented tests so that they fail for others who are attempting to reproduce the bugs. * Merging improved schema test methods added by @ochafik in ggerganov#7797 * Adding #define to temporarily remove failing tests so that this PR can pass CI, but still be useful for other PRs that want to leverage the framework. * Fixing nits from ochafik. Removing escape slashes, adding additional failing cases, fixing some other strings. * Fixing grammar indentation to be consistent throughout file.
…Maximum (ggerganov#7797) * json: support minimum for positive integer values * json: fix min 0 * json: min + max integer constraints * json: handle negative min / max integer bounds * json: fix missing paren min/max bug * json: proper paren fix * json: integration test for schemas * json: fix bounds tests * Update json-schema-to-grammar.cpp * json: fix negative max * json: fix negative min (w/ more than 1 digit) * Update test-grammar-integration.cpp * json: nit: move string rules together * json: port min/max integer support to Python & JS * nit: move + rename _build_min_max_int * fix min in [1, 9] * Update test-grammar-integration.cpp * add C++11-compatible replacement for std::string_view * add min/max constrained int field to pydantic json schema example * fix merge * json: add integration tests for min/max bounds * reshuffle/merge min/max integ test cases * nits / cleanups * defensive code against string out of bounds (apparently different behaviour of libstdc++ vs. clang's libc++, can't read final NULL char w/ former)
…Maximum (ggerganov#7797) * json: support minimum for positive integer values * json: fix min 0 * json: min + max integer constraints * json: handle negative min / max integer bounds * json: fix missing paren min/max bug * json: proper paren fix * json: integration test for schemas * json: fix bounds tests * Update json-schema-to-grammar.cpp * json: fix negative max * json: fix negative min (w/ more than 1 digit) * Update test-grammar-integration.cpp * json: nit: move string rules together * json: port min/max integer support to Python & JS * nit: move + rename _build_min_max_int * fix min in [1, 9] * Update test-grammar-integration.cpp * add C++11-compatible replacement for std::string_view * add min/max constrained int field to pydantic json schema example * fix merge * json: add integration tests for min/max bounds * reshuffle/merge min/max integ test cases * nits / cleanups * defensive code against string out of bounds (apparently different behaviour of libstdc++ vs. clang's libc++, can't read final NULL char w/ former)
Building upon #6640, I went on a wild goose chase wondering whether one could support numeric bounds with grammars, and I'm relatively happy with the provisional outcome.
This is still a WIP, here are some TODOs:
Possible follow ups:
/cc @HanClinto this since you mentioned it in #7789 :-D