json: refine whitespace rules to avoid runaways #7866

Merged (1 commit) on Jun 11, 2024

@ochafik ochafik commented Jun 11, 2024

Quick follow up to #7841 (@HanClinto I took the bait of your "good start" 🤪)

Defining whitespace as ws ::= | " " | "\n" [ \t]{0,20} allows compact inline {"a": 1}, spacy inline { "a" : 1 } and indented JSON, but disallows multiple empty lines, multiple spaces not in an indenting context, etc.
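To make the rule concrete, here is a minimal sketch that models a single expansion of `ws` as a Python regular expression and checks a few strings against it. The regex translation and the helper name `is_valid_ws` are assumptions made for illustration; they are not part of llama.cpp, and the sketch says nothing about how `ws` occurrences combine elsewhere in the grammar.

```python
import re

# Assumed regex translation of ONE expansion of:
#   ws ::= | " " | "\n" [ \t]{0,20}
# i.e. nothing, a single space, or a newline followed by up to 20 spaces/tabs.
WS = re.compile(r"(?: |\n[ \t]{0,20})?")


def is_valid_ws(s: str) -> bool:
    """Return True if `s` is exactly one expansion of the ws rule (hypothetical helper)."""
    return WS.fullmatch(s) is not None


# Empty, one space, or a newline plus bounded indentation are accepted.
accepted = ["", " ", "\n", "\n    ", "\n\t\t", "\n" + " " * 20]
# Double spaces, blank lines, a space before a newline, a bare tab,
# and indentation beyond 20 characters are rejected.
rejected = ["  ", "\n\n", " \n", "\t", "\n" + " " * 21]

for s in accepted:
    assert is_valid_ws(s), repr(s)
for s in rejected:
    assert not is_valid_ws(s), repr(s)
```

The bound of 20 on the indentation is what keeps a single `ws` from growing without limit.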

For instance, this removes the trailing spaces generated by the following call:

./main --log-disable --seed 1133 \
  --grammar-file grammars/json.gbnf \
  -mu https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct-Q8_0.gguf \
  -p "Tell me a love story with a JSON structure:
"
Show output from master
{
  "characters": [
    {"name": "Alex", "age": 25, "occupation": "Software Engineer"},
    {"name": "Maya", "age": 28, "occupation": "Graphic Designer"}
  ],
  "story": {
    "beginning": "Alex and Maya met at a coffee shop in the heart of the city.",
    "middle": "They bonded over their shared love of art and technology, and soon became inseparable.",
    "end": "After a year of dating, Alex proposed to Maya with a custom-made ring and a romantic sunset view."
  }
}





  





  



Performance seems similar to master in the couple of attempts I've made.

Show benchmark commands
hyperfine \
  --warmup 1 --runs 5 \
  -L branch master,json-ws \
  --prepare 'git checkout {branch} && make clean && make -j LLAMA_CURL=1 main' \
  'branch={branch} \
    ./main --log-disable --seed 1133 \
      --grammar-file grammars/json.gbnf \
      -mu https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct-Q8_0.gguf \
      -p "Tell me a love story with a JSON structure:
      "'

Note: also tested ws ::= | " " | "\n" (" "{0,10} | "\t"{0,10}) but it's slower, probably because of the extra stack / alternatives overhead.
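The two candidate rules also accept slightly different strings, independent of speed. The sketch below compares them using assumed regex translations of a single `ws` expansion; the names `MERGED`, `ALTERNATIVE`, and `accepts` are hypothetical, and a regex comparison cannot reproduce the sampling-speed difference mentioned in the note.

```python
import re

# Assumed regex translations of one ws expansion (illustrative, not llama.cpp code).
# ws ::= | " " | "\n" [ \t]{0,20}
MERGED = re.compile(r"(?: |\n[ \t]{0,20})?")
# ws ::= | " " | "\n" (" "{0,10} | "\t"{0,10})
ALTERNATIVE = re.compile(r"(?: |\n(?: {0,10}|\t{0,10}))?")


def accepts(pattern: re.Pattern, s: str) -> bool:
    """Return True if `s` is exactly one expansion of the modeled rule."""
    return pattern.fullmatch(s) is not None


samples = {
    "newline + 4 spaces": "\n    ",
    "newline + 2 tabs": "\n\t\t",
    "newline + space then tab": "\n \t",
    "newline + 12 spaces": "\n" + " " * 12,
}

for label, s in samples.items():
    print(f"{label}: merged={accepts(MERGED, s)} alternative={accepts(ALTERNATIVE, s)}")
```

Under these translations, the alternative rule rejects indentation that mixes spaces and tabs and caps a run at 10 characters, while the merged rule allows any mix up to 20.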

@github-actions github-actions bot added testing Everything test related examples python python script changes server labels Jun 11, 2024
@ochafik ochafik marked this pull request as ready for review June 11, 2024 01:09
@HanClinto HanClinto left a comment

Looks good to me! 👍

@ochafik ochafik merged commit b61eb96 into ggml-org:master Jun 11, 2024
43 of 55 checks passed