Leading Comma in JSON Array #99
Comments
Can you make sure that you are using the latest version of LM Format
Enforcer? We fixed a few bugs like this in the past few versions.
If you are, can you attach an example of JSON that the library allows but
that is invalid, for the schema you show?
On Tue, May 14, 2024 at 3:39 PM NJordan72 wrote:
This one might be user error, but I was trying to understand how the
JsonSchemaParser worked and wrote a quick and dirty loop to generate random
JSON from a parser.
Some of the time it works fine, but other times I get invalid JSON. The
most notable/reproducible of these errors is when an array of objects is
generated with a leading comma: [, {object},...]. It appears that in some
cases a comma is an allowable character right after the array/list has been
started.
There is some randomness involved in the code below, but if I run it 4-5
times I can usually reproduce it.
from typing import List
from pydantic import BaseModel, Field
from lmformatenforcer import CharacterLevelParserConfig, JsonSchemaParser
import random
import json

class TreeNode(BaseModel):
    name: str = Field(max_length=4)
    children: List["TreeNode"] = Field(max_items=2)

class Result(BaseModel):
    tree: TreeNode

parser = JsonSchemaParser(Result.model_json_schema(), config=CharacterLevelParserConfig(max_consecutive_whitespaces=1))
result = ""
# Random walk: keep picking any character the parser allows until none remain.
while True:
    allowable = parser.get_allowed_characters()
    if not allowable:
        break
    choice = random.choice(allowable)
    parser = parser.add_character(choice)
    result += choice
# The finished string should always be valid JSON if the parser is correct.
try:
    json.loads(result)
    print(result)
except json.JSONDecodeError:
    print(f"Invalid JSON: {result}")
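For comparison, here is a toy model of the rule the random walk above is probing: right after `[`, only an item or `]` should be allowed, and a comma only after an item. This is a sketch for illustration, not LM Format Enforcer's implementation; `ToyArrayState`, `allowed`, `add`, and `random_array` are hypothetical names, and items are abstracted to a single `"item"` symbol.

```python
import random
from dataclasses import dataclass, replace
from typing import List, Optional, Set


@dataclass(frozen=True)
class ToyArrayState:
    """Toy state for a JSON array just after '[' (NOT the library's parser)."""
    items_seen: int = 0
    after_comma: bool = False
    closed: bool = False
    max_items: Optional[int] = None

    def allowed(self) -> Set[str]:
        """Symbols that may legally come next: 'item', ',' or ']'."""
        if self.closed:
            return set()
        if self.after_comma:
            # A comma must always be followed by another item.
            return {"item"}
        room = self.max_items is None or self.items_seen < self.max_items
        if self.items_seen == 0:
            # Freshly opened array: an item or ']' only, never a comma.
            return {"item", "]"} if room else {"]"}
        # After an item: close, or continue with a comma if there is room.
        return {",", "]"} if room else {"]"}

    def add(self, symbol: str) -> "ToyArrayState":
        """Return the next state, rejecting symbols that are not allowed."""
        if symbol not in self.allowed():
            raise ValueError(f"{symbol!r} is not allowed here")
        if symbol == "item":
            return replace(self, items_seen=self.items_seen + 1, after_comma=False)
        if symbol == ",":
            return replace(self, after_comma=True)
        return replace(self, closed=True)


def random_array(rng: random.Random, max_items: int = 2) -> List[str]:
    """Random walk over allowed symbols, mirroring the loop in the report."""
    state = ToyArrayState(max_items=max_items)
    symbols = ["["]
    while state.allowed():
        symbol = rng.choice(sorted(state.allowed()))
        state = state.add(symbol)
        symbols.append(symbol)
    return symbols


print(random_array(random.Random(0)))
```

With this rule in place, no random walk can produce `[` followed directly by `,`, which is the behavior the report expects from the real parser.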
I am using 0.10.1. Here is the invalid output:

{"tree":{"name":"sJGl","children":[{"name":"czJ.","children":[{"name":"/sb!","children":[]},{"children":[],"name":"e*2g"}]},{"name":"u0Yp","children":[,{"children":[{"children":[,{"children":[{"name":"9q,+","children":[]}],"name":"83,M"}],"name":"J,gG"}],"name":"@B.}"}]}]}}

Notably, there are two leading commas. FWIW, this is the Pydantic JSON Schema:

{'$defs': {'TreeNode': {'properties': {'name': {'maxLength': 4, 'title': 'Name', 'type': 'string'}, 'children': {'items': {'$ref': '#/$defs/TreeNode'}, 'maxItems': 2, 'title': 'Children', 'type': 'array'}}, 'required': ['name', 'children'], 'title': 'TreeNode', 'type': 'object'}}, 'properties': {'tree': {'$ref': '#/$defs/TreeNode'}}, 'required': ['tree'], 'title': 'Result', 'type': 'object'}
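A quick way to confirm the problem in the output above, using only the standard library: `json.loads` rejects it, and a small scanner counts the array openings that are directly followed by a comma. The helper `find_leading_commas` is a hypothetical name written for this check; it skips over string contents so commas inside values such as "9q,+" are not counted.

```python
import json

# Invalid output quoted in the comment above, copied verbatim.
output = '{"tree":{"name":"sJGl","children":[{"name":"czJ.","children":[{"name":"/sb!","children":[]},{"children":[],"name":"e*2g"}]},{"name":"u0Yp","children":[,{"children":[{"children":[,{"children":[{"name":"9q,+","children":[]}],"name":"83,M"}],"name":"J,gG"}],"name":"@B.}"}]}]}}'


def find_leading_commas(text):
    """Return the index of every ',' that directly follows a '[' outside strings."""
    positions = []
    in_string = False
    escaped = False
    last_structural = None  # last non-whitespace character seen outside strings
    for index, char in enumerate(text):
        if in_string:
            if escaped:
                escaped = False
            elif char == "\\":
                escaped = True
            elif char == '"':
                in_string = False
            continue
        if char.isspace():
            continue
        if char == '"':
            in_string = True
        elif char == "," and last_structural == "[":
            positions.append(index)
        last_structural = char
    return positions


try:
    json.loads(output)
    is_valid = True
except json.JSONDecodeError:
    is_valid = False

print(is_valid)                          # False
print(len(find_leading_commas(output)))  # 2
```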
As I iterate through you can see that after opening the
Should be fixed in v0.10.2. Please reopen if the issue persists.
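To check whether an environment already has the fixed release, something like the following could work. It is a minimal sketch: it assumes the package is installed under the distribution name `lm-format-enforcer`, and `parse_version` and `has_fix` are hypothetical helpers that only compare the leading major.minor.patch numbers.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Extract (major, minor, patch) from the start of a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"Unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(installed, fixed="0.10.2"):
    """True if the installed version is at least the release with the fix."""
    return parse_version(installed) >= parse_version(fixed)


try:
    # Assumption: the distribution name on the package index.
    installed = version("lm-format-enforcer")
    print(installed, "has the fix" if has_fix(installed) else "predates the fix")
except PackageNotFoundError:
    print("lm-format-enforcer is not installed in this environment")
```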