Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: llama-gbnf-validator parses grammar but gets a seg fault when validating an input string against the grammar #10321

Open
nissenbenyitskhak opened this issue Nov 15, 2024 · 8 comments
Assignees
Labels
bug Something isn't working critical severity Used to report critical severity bugs in llama.cpp (e.g. Crashing, Corrupted, Dataloss)

Comments

@nissenbenyitskhak
Copy link

What happened?

I ran./llama-gbnf-validator mygrammar.txt mytestprogram.txtand after checking the grammar itself, it started to parse the test file and it went into an infinite loop calling static void llama_grammar_advance_stack() and eventually blew up in tiny_malloc_from_free_list()

mygrammar.txt
mytestprogram.txt
llama-grammar.cpp.txt

I modified llama-grammar.cpp to add some console debug statements, so the line numbers in the stack trace may be off a bit from the version I used. See the attached file llama-grammar.cpp.txt for my minor changes.

I found numerous bugs and problems with the validator, including these:

  1. The infinite loop noted above for the grammar and test file provided above. This is the most serious.
  2. If I use the construct nts? or nts* or nts+ on the rhs of a rule in the grammar, where nts is a defined non-terminal symbol defined elsewhere, I get the error "Undefined rule" with no indication of what the rule is, or how or why it is was created as undefined. To fix it, I have to parenthesize the nts, e.g., (nts)?, on the right hand side of the rule being defined. Nowhere is it documented that nts? is not valid gbnf, and if it is, the validator should complain, rather than just producing an invalid grammar representation.
  3. The documentation for gbnf states that non-terminal symbols may consist of lower case letters and "-", e.g., "if-statement". If I use a camelCase nts, e.g., "ifStatement", it is accepted by the validator but the test file parser does not work properly (at least in some cases). So is a camelCase nts allowed? If it isn't, the validator should complain and state the bad nts it sees, rather than just producing an invalid grammar representation.
  4. The gbnf grammar seems to require that I sprinkle ws (whitespace) non-terminal symbols in seemingly random places throughout the grammar rules because there is apparently no notion of a lexical analyzer. This requires a very tedious trial-and-error process, because if I put unnecessary ws tokens in a rule, it prevents the rule from firing, and if I leave a necessary one out, it also prevents the rule from firing! If you cannot make it work like other bnf grammars, please at least document when one must add ws and when must not.

Name and Version

% ./llama-cli --version
version: 4075 (fb4a0ec)
built with Apple clang version 16.0.0 (clang-1600.0.26.4) for arm64-apple-darwin23.6.0

What operating system are you seeing the problem on?

Mac

Relevant log output

tiny_malloc_from_free_list 0x00000001838043f0
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:729
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
................ 65,000 lines removed ...............
llama_grammar_advance_stack(const std::vector<> &, const std::vector<> &, std::vector<> &) llama-grammar.cpp:731
llama_grammar_accept(const std::vector<> &, const std::vector<> &, unsigned int, std::vector<> &) llama-grammar.cpp:864
[Inlined] llama_grammar_validate(llama_grammar *, const std::string &, unsigned long &, std::string &) gbnf-validator.cpp:21
main gbnf-validator.cpp:101
@nissenbenyitskhak nissenbenyitskhak added bug-unconfirmed critical severity Used to report critical severity bugs in llama.cpp (e.g. Crashing, Corrupted, Dataloss) labels Nov 15, 2024
@nissenbenyitskhak
Copy link
Author

Meant to say: the line numbers in the stack trace may be off a bit from the version on GitHub. They correspond to my edited version with the "print" statements, which is attached.

@HanClinto
Copy link
Collaborator

Thank you for the report and the test case!

@HanClinto HanClinto self-assigned this Dec 3, 2024
@HanClinto HanClinto added bug Something isn't working and removed bug-unconfirmed labels Dec 4, 2024
@HanClinto
Copy link
Collaborator

HanClinto commented Dec 4, 2024

  1. The documentation for gbnf states that non-terminal symbols may consist of lower case letters and "-", e.g., "if-statement". If I use a camelCase nts, e.g., "ifStatement", it is accepted by the validator but the test file parser does not work properly (at least in some cases). So is a camelCase nts allowed? If it isn't, the validator should complain and state the bad nts it sees, rather than just producing an invalid grammar representation.

Note that this bug appears related to #7720, but I don't understand the bit about it not working properly in some cases. Best I can tell, the parser doesn't care about upper-case / lower-case, but something might have changed since I last looked at it.

@nissenbenyitskhak
Copy link
Author

nissenbenyitskhak commented Dec 4, 2024 via email

@nissenbenyitskhak
Copy link
Author

nissenbenyitskhak commented Dec 4, 2024

Please focus on the primary bug, 1. There is an infinite loop when the grammar (which passes the verification) is used to parse a simple program. There could possibly be something wrong with my grammar, but I don't see it, and the verifier doesn't see it either. And there is also something wrong with the logic in parsing code. It makes some assumptions that are not valid, thus doing an infinite regress.

Since the validator apparently does not do a thorough job of checking the grammar, what do you suggest that I need to do to check it?

@nissenbenyitskhak
Copy link
Author

@HanClinto Please respond. Thanks.

@HanClinto
Copy link
Collaborator

HanClinto wrote: "Does the bug occur with the release version, how is one to know if the bug is in your modifications." Answer: How is one to know? Because my only modification is adding print statements... The bug occurs with the original released version as well.

I don't remember writing that note, but the extent of your modifications are clear to me, thanks.

Please focus on the primary bug, 1.

My comment was to cross-reference your third point to a previously logged bug in our tracker, so that anyone looking at the other issue could be aware of discussion that is happening here, and vice-versa. Please don't take it to mean that I was ignoring your primary issue.

There is an infinite loop when the grammar (which passes the verification) is used to parse a simple program. There could possibly be something wrong with my grammar, but I don't see it, and the verifier doesn't see it either. And there is also something wrong with the logic in parsing code. It makes some assumptions that are not valid, thus doing an infinite regress.

Since the validator apparently does not do a thorough job of checking the grammar, what do you suggest that I need to do to check it?

I have applied your changes and successfully run them, but I have not yet dug into your GBNF file to debug what is going on. I haven't forgotten about this issue, but as I'm currently in the middle of changing jobs, things have been a bit busier than normal the past several weeks.

Glancing through your file, and it really is quite impressively large! It's clear you've put a lot of effort into this, and I'm impressed. I can understand why you are having a difficult time validating it manually. Are you using this grammar for anything other than LLM generation, or are you intending to use this as a formal grammar in another context as well?

This is not a huge issue, but one thing to note:

any-character ::= [\000-\FFF]

Support for . was added in #6467. Functionally, this won't behave any differently than [\\000-\\FFF] (and it will parse only slightly faster), but it's something that caught my eye when reading your grammar.

Something that I'm curious about is how well a different parser would do with this sort of grammar. It might be possible to adapt this grammar to the llguidance engine, which has some promising advantages over our default gbnf engine. That feature is not yet merged into the trunk, though you can check out #10224 if you wanted to experiment with it. I'm not positive, but I'm optimistic that it might work better for a grammar of this size.

My next step is to run your grammar through the parser and double check that it's linking everything correctly, and try to figure out where the infinite loop is happening that you're referencing. We've certainly had trouble with some very large stacks and out-of-memory exceptions with our current parsing structure. I've laid out some ideas for how we might be able to optimize this, but it's a very difficult problem that I haven't fully solved yet. Your test case will provide an excellent stress test, thank you!

@nissenbenyitskhak
Copy link
Author

nissenbenyitskhak commented Dec 17, 2024

To answer your questions:

  • Are you using the grammar for something other than LLM generation?
    Yes, sort of. I have it as an Antlr grammar for the Gosu language plugin in IntelliJ. I had to remove the Antlr annotations and rename all the NTS symbols to avoid camelCase, and things like that to get it to be accepted by the validator. I hit some other bugs and mysterious errors in the validator, such as "undefined rule" with no rule name given. I got past those. Then I moved on to trying to parse some example Gosu and figure out why it blew up, in various cases, using my debugging. But now that I've hit this infinite loop, I don't know quite what is wrong.

  • any-character ::= [\000-\FFF]: Support for . was added in Added support for . (any character) token in grammar engine. #6467
    I take it I can use . in the grammar in place of the NTS any-character?

  • adapt this grammar to the llguidance engine
    I need something I can present to ./llama-cli -m <path_to_your_model> -p "<partial-statement>" --grammar-file FILE as a test, and then FILE can be used for our Copilot to suggest Gosu code. If you have a more robust grammar mechanism that can be interpreted by the llguidance engine as part of Copilot, which I can test similarly to the above command, that would be great!

I'm glad my test case will help in your development! I have written grammar checkers and compilers before, and I see that your C++ implementation is trying to be very fast (rather than very understandable). Anyway, I couldn't quite understand it, especially given the lousy C++ IDE I'm using (CLion).

I would appreciate some estimates of a timeframe when you might be able to debug the infinite loop, or fix the validator, or merge the llguidance engine into the trunk.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working critical severity Used to report critical severity bugs in llama.cpp (e.g. Crashing, Corrupted, Dataloss)
Projects
None yet
Development

No branches or pull requests

2 participants