Leading Comma in JSON Array #99

Closed
NJordan72 opened this issue May 14, 2024 · 4 comments

@NJordan72
Contributor

This one might be user error, but I was trying to understand how the JsonSchemaParser worked and wrote a quick and dirty loop to generate random JSON from a parser.

Some of the time it works fine, but other times I get invalid JSON. The most notable and reproducible of these errors is an array of objects generated with a leading comma: [, {object},...]. It appears that in some cases a comma is an allowed character immediately after the array has been opened.

There is some randomness involved in the code below, but if I run it 4-5 times I can usually reproduce it.

import json
import random
from typing import List

from pydantic import BaseModel, Field

from lmformatenforcer import CharacterLevelParserConfig, JsonSchemaParser


class TreeNode(BaseModel):
    name: str = Field(max_length=4)
    children: List["TreeNode"] = Field(max_items=2)


class Result(BaseModel):
    tree: TreeNode


parser = JsonSchemaParser(
    Result.model_json_schema(),
    config=CharacterLevelParserConfig(max_consecutive_whitespaces=1),
)

result = ""

# Walk the parser one character at a time, choosing randomly among the allowed characters.
while True:
    allowable = parser.get_allowed_characters()

    # Stop once no further characters are allowed.
    if not allowable:
        break

    choice = random.choice(allowable)
    parser = parser.add_character(choice)
    result += choice

# Everything the parser allowed should parse as JSON; report the string when it does not.
try:
    json.loads(result)
    print(result)
except json.JSONDecodeError:
    print(f"Invalid JSON: {result}")
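
Because the failure only shows up on some runs, a seeded search can make it repeatable. The following is a minimal sketch, not part of the original report: `first_invalid_sample` and `toy_generate` are hypothetical names, and `toy_generate` is only a stand-in for the loop above. To use it with the real parser, one would wrap that loop in a function that takes a `random.Random` instance and calls `rng.choice(allowable)` instead of `random.choice(allowable)`.

```python
import json
import random
from typing import Callable, Iterable, Optional, Tuple


def first_invalid_sample(
    generate: Callable[[random.Random], str],
    seeds: Iterable[int],
) -> Optional[Tuple[int, str]]:
    """Return (seed, text) for the first seed whose output is not valid JSON."""
    for seed in seeds:
        # A dedicated Random instance per seed keeps each run reproducible.
        text = generate(random.Random(seed))
        try:
            json.loads(text)
        except json.JSONDecodeError:
            return seed, text
    return None


def toy_generate(rng: random.Random) -> str:
    # Hypothetical generator that sometimes emits a leading comma,
    # standing in for the character-by-character loop in the report.
    return rng.choice(['[{"a": 1}]', '[, {"a": 1}]'])


if __name__ == "__main__":
    print(first_invalid_sample(toy_generate, range(100)))
```

Once a failing seed is known, it can be quoted in the report so the maintainer sees exactly the same string.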
@noamgat
Owner

noamgat commented May 14, 2024 via email

@NJordan72
Contributor Author

NJordan72 commented May 14, 2024

I am using 0.10.1

Here is the invalid output...

{"tree":{"name":"sJGl","children":[{"name":"czJ.","children":[{"name":"/sb!","children":[]},{"children":[],"name":"e*2g"}]},{"name":"u0Yp","children":[,{"children":[{"children":[,{"children":[{"name":"9q,+","children":[]}],"name":"83,M"}],"name":"J,gG"}],"name":"@B.}"}]}]}}

Notably there are two leading commas.
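
For anyone who wants to locate such commas mechanically, here is a small stdlib-only sketch. The function name `find_leading_commas` is made up for illustration; it scans the text, skips over string contents (so values like "9q,+" are ignored), and records every comma that directly follows an opening bracket.

```python
from typing import List

# The invalid output quoted above, copied verbatim.
SAMPLE = r'{"tree":{"name":"sJGl","children":[{"name":"czJ.","children":[{"name":"/sb!","children":[]},{"children":[],"name":"e*2g"}]},{"name":"u0Yp","children":[,{"children":[{"children":[,{"children":[{"name":"9q,+","children":[]}],"name":"83,M"}],"name":"J,gG"}],"name":"@B.}"}]}]}}'


def find_leading_commas(text: str) -> List[int]:
    """Return the index of every ',' that directly follows '[' outside a string."""
    positions = []
    in_string = False
    escaped = False
    last_token = ""  # last non-whitespace character seen outside a string
    for i, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
                last_token = '"'
            continue
        if ch == '"':
            in_string = True
            continue
        if ch.isspace():
            continue
        if ch == "," and last_token == "[":
            positions.append(i)
        last_token = ch
    return positions


if __name__ == "__main__":
    for pos in find_leading_commas(SAMPLE):
        # Show a little context around each offending comma.
        print(pos, SAMPLE[max(0, pos - 12):pos + 2])
```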

FWIW this is the Pydantic JSON Schema:

{'$defs': {'TreeNode': {'properties': {'name': {'maxLength': 4, 'title': 'Name', 'type': 'string'}, 'children': {'items': {'$ref': '#/$defs/TreeNode'}, 'maxItems': 2, 'title': 'Children', 'type': 'array'}}, 'required': ['name', 'children'], 'title': 'TreeNode', 'type': 'object'}}, 'properties': {'tree': {'$ref': '#/$defs/TreeNode'}}, 'required': ['tree'], 'title': 'Result', 'type': 'object'}

@NJordan72
Contributor Author

As I step through the parser, you can see that right after the opening [ the allowed characters are ]{, and the comma seems wrong.

Allowable: c
Allowable: h
Allowable: i
Allowable: l
Allowable: d
Allowable: r
Allowable: e
Allowable: n
Allowable: "
Allowable: :
Allowable: [
Allowable: ]{,
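
The trace above can be turned into a direct check. This is a sketch under assumptions: it relies only on the two methods used in the reporter's loop (`get_allowed_characters` and `add_character`), and `CharParser`, `allowed_after`, and `ToyArrayParser` are hypothetical names. `ToyArrayParser` is a stand-in that accepts arrays of empty objects so the snippet runs without the library; it is not how lmformatenforcer is implemented.

```python
from typing import Protocol


class CharParser(Protocol):
    """The two methods the reporter's loop relies on."""

    def get_allowed_characters(self) -> str: ...

    def add_character(self, new_character: str) -> "CharParser": ...


def allowed_after(parser: CharParser, prefix: str) -> str:
    """Feed `prefix` one character at a time and return what may follow it."""
    for ch in prefix:
        if ch not in parser.get_allowed_characters():
            raise ValueError(f"{ch!r} is not allowed at this point")
        parser = parser.add_character(ch)
    return parser.get_allowed_characters()


class ToyArrayParser:
    """Hypothetical stand-in: accepts arrays of empty objects such as [{},{}]."""

    def __init__(self, text: str = "", buggy: bool = False):
        self.text = text
        self.buggy = buggy

    def get_allowed_characters(self) -> str:
        if self.text == "":
            return "["
        last = self.text[-1]
        if last == "[":
            # The buggy variant mimics the trace above by also offering ','.
            return "]{," if self.buggy else "]{"
        if last == "{":
            return "}"
        if last == "}":
            return ",]"
        if last == ",":
            return "{"
        return ""  # nothing may follow the closing ']'

    def add_character(self, new_character: str) -> "ToyArrayParser":
        return ToyArrayParser(self.text + new_character, self.buggy)


if __name__ == "__main__":
    print(repr(allowed_after(ToyArrayParser(), "[")))
    print(repr(allowed_after(ToyArrayParser(buggy=True), "[")))
```

Passing the real parser and the prefix `{"tree":{"name":"a","children":[` to `allowed_after` should then show whether a comma is still offered right after the bracket.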

@noamgat
Owner

noamgat commented May 17, 2024

Should be fixed in v0.10.2, please reopen if the issue persists.
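
Before re-running the reproduction, it may help to confirm that the installed release is at least the one named above. A minimal sketch, assuming the distribution is published under the name `lm-format-enforcer` and uses plain dotted numeric versions; `parse_version`, `has_fix`, and `installed_version` are illustrative helpers, not part of the library.

```python
from importlib import metadata
from typing import Optional, Tuple

FIXED_IN = (0, 10, 2)  # release named in the maintainer's comment


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a plain dotted release string such as '0.10.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def has_fix(version_text: str) -> bool:
    """True if the given release is at or after the one that carries the fix."""
    return parse_version(version_text) >= FIXED_IN


def installed_version(dist_name: str = "lm-format-enforcer") -> Optional[str]:
    # Assumption: this is the distribution name; returns None if it is not installed.
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_version()
    if current is None:
        print("package not installed")
    else:
        print(current, "has fix" if has_fix(current) else "predates fix")
```

Comparing integer tuples rather than strings matters here, since "0.9.9" sorts after "0.10.2" as text but before it as a version.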

noamgat closed this as completed May 17, 2024