Bug: Grammar readme seems incorrect #7720
Comments
@thekevinscott Thank you for the report! I can confirm that it is correct. Looking through the code, the set of valid identifiers is much larger than documented: it includes anything made up of a-zA-Z0-9 and "-". This means that identifiers can be all numbers or pure dashes, and identifiers that differ only by case ("upper-lower-case" and "UPPER-LOWER-CASE") are distinct from each other.

I built an integration test that compiles and runs correctly. This is a valid grammar (according to the engine) that seems to stretch the rules and certainly violates the documentation, as you noted:

root ::= simple-identifier
simple-identifier ::= ---dashmaster--- | mIxEd-CaSe | UPPER-LOWER-CASE | upper-lower-case | 12345
---dashmaster--- ::= [-_]+
mIxEd-CaSe ::= [a-z][A-Z][a-z][A-Z]
UPPER-LOWER-CASE ::= [A-Z][A-Z]
upper-lower-case ::= [a-z][a-z]
12345 ::= "67890"

@ochafik / @ggerganov this feels like a minor issue, so I feel a little bad pinging y'all on this one, but I also struggle with direction-level decisions on my own. Do either of you have opinions? Should we:
As far as I can tell, that line of documentation was authored by @ejones. Maybe they have some insight into the original intent?

My initial instinct would be to tighten the rules, in case the parser needs to be refactored in the future. In the examples you provided

On the other hand, if it doesn't cause any parsing issues in the current state, maybe it's worth leaving as is and just updating the documentation, particularly since your integration test above now explicitly tests for these edge cases.
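
To make the two options discussed in this thread concrete, here is a minimal Python sketch that compares the rule-name character set reported above (a-zA-Z0-9 and "-") with a stricter "lowercase words joined by dashes" rule. Both regular expressions and all function names are assumptions written for illustration; they are not taken from the engine's parser or from the README.

```python
import re

# Assumption: the permissive rule described in this thread, where a rule name
# is any non-empty run of a-z, A-Z, 0-9, and "-". Illustrative only.
ACCEPTED_NAME = re.compile(r"[A-Za-z0-9-]+")

# Assumption: one possible tightened rule, lowercase words joined by single
# dashes. This is a hypothetical stricter convention, not the engine's check.
STRICT_NAME = re.compile(r"[a-z]+(?:-[a-z]+)*")


def is_accepted_name(name: str) -> bool:
    """Return True if the whole name fits the permissive character set."""
    return ACCEPTED_NAME.fullmatch(name) is not None


def is_strict_name(name: str) -> bool:
    """Return True if the whole name is a dashed lowercase word."""
    return STRICT_NAME.fullmatch(name) is not None


def classify(names):
    """Map each name to a (permissive, strict) pair of booleans."""
    return {n: (is_accepted_name(n), is_strict_name(n)) for n in names}


if __name__ == "__main__":
    # Rule names from the test grammar in this thread, plus dataType from c.gbnf.
    rule_names = [
        "root",
        "simple-identifier",
        "---dashmaster---",
        "mIxEd-CaSe",
        "UPPER-LOWER-CASE",
        "upper-lower-case",
        "12345",
        "dataType",
    ]
    for name, (permissive, strict) in classify(rule_names).items():
        print(f"{name}: permissive={permissive} strict={strict}")
```

Under these assumptions, every rule name in the test grammar passes the permissive check, while only root, simple-identifier, and upper-lower-case would survive the stricter one. That gives a rough sense of which existing grammars a tightened rule could affect.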
What happened?
This bit on the grammar readme states:
However, the c.gbnf defines a dataType rule (which features a non-lower-case letter) and this grammar appears to be valid.

I'm not sure what the intended behavior is. I would be happy to update the README if uppercase variables are supported.
Name and Version
N/A
What operating system are you seeing the problem on?
No response
Relevant log output
No response