Escapes sequence recognition failure in character sets #1537
Yeah, basically it's saying now that the escape is unnecessary. I didn't realize the PR made it an error rather than warning. sorry about that. Please do create a PR for grammars-v4! :) |
Does it mean that vertical tabs are not allowed anymore (i.e., they should be removed)? And a second thing, grammars-v4 still uses 4.5.3. Will it be bumped up? |
Hmm...well, \v can be done with \u000B so let's leave it. it's rare. Yeah, we should bump to 4.6 |
It seems to me that pull request #1517 introduced this behavior. In my opinion these cases are not clear. So, @renatahodovan, make a pull request to grammars-v4 if you want :) |
@renatahodovan we could add |
@parrt yes, I think it would be useful for the sake of backward compatibility. Thanks! |
Yes, warning instead of error is a good idea. I'll try to fix these issues at the beginning of January. |
Any pull request for this coming? If not, I can do it. |
Not ready for now :( |
Just to help on testing.
fails on 4.6 but not on 4.5.3 |
@Nulleye thank you for feedback! |
Sorry, I realized that the web form has eaten some backslash chars. |
@Nulleye i added the tick marks to make it appear as code. thanks! |
@Nulleye You don't need to escape the
|
@KvanTTT can you take a look at this now that we've pulled in the unicode 32 stuff? |
OK, I'll take a look. |
I think we should not support it because of inconsistency. In 4.5.3 we can not write Since 4.6 version the |
Ok, since 4.6 will not allow, let's keep it as-is. no |
Moreover, I suggest just to close this issue. In 4.5.3 version we can not use backslash inside quote literals. Since 4.6 version we can not use backslash inside square-bracket blocks too. So, since ANTLR 4.6 escape chars processing is consistent. See also testValidEscapeSequences test. |
Ok, will close. Can we change the errors to be warnings (I think you made them errors) though that broke all the grammars-v4 grammars? |
Warnings for square-bracket block or for both syntax? It's easier to update our grammars :) |
Well people over-escaped previously which caused lots of failures to build as it became an error rather than warning (or was previously just ignored). Could be lots of old 4.5.3 grammars out there that did |
…. This is related to antlr#1537. All tool errors pass now.
@parrt what should we do with such char sets?
Possible solutions:
I think the second choice is better. |
yeah, an error is best choice. can we reuse a previous error? |
I'm afraid I didn't find a corresponding type for such an error. |
dang. ok, maybe create another error, maybe error code 165 INVALID_SET? |
I agree. I'll fix it tomorrow. |
Hello, along similar lines, I've got the following escape sequences to consider:
I know how I might approach it using something like Boost.Spirit.Qi, with:

hex_esc %= no_case["\\x"] >> uint_parser<unsigned char, 16, 2, 2>{};
oct_esc %= '\\' >> uint_parser<unsigned char, 8, 3, 3>{};
// The last bit in this phrase is literally, "Or Any Characters Not in the Sequence".
char_val %= hex_esc | oct_esc | char_esc | ~char_("\0\n\\");
str_lit %= ("'" >> *(char_val - "'") >> "'")
    | ('"' >> *(char_val - '"') >> '"')
    ;

And the escape sequences:

struct escapes_t : qi::symbols<char, char> {
    escapes_t() {
        this->add("\\a", '\a')
            ("\\b", '\b')
            ("\\f", '\f')
            ("\\n", '\n')
            ("\\r", '\r')
            ("\\t", '\t')
            ("\\v", '\v')
            ("\\\\", '\\')
            ("\\'", '\'')
            ("\\\"", '"')
        ;
    }
} char_esc;

Curious how that might flow in ANTLR4 targeting C#. |
@parrt wrote:
It seems that this is not what has been implemented in the end. I tried this on 4.9:
This makes ANTLR crash:
See the discussion at BNFC/bnfc#329. Should I open a new |
hi. at this point I'm not doing a lot of fixes but it seems reasonable to prevent the tool from failing given |
No hurry. |
Now ANTLR reports |
After the 4.6 release, the behaviour of escape sequences in the lexer's character sets changed compared to 4.5.3. I'm not sure whether these are bugs or features, but I think they are worth mentioning:
- [\[]: this worked in 4.5.3 but in 4.6 it's an invalid escape sequence (instead, [[] can be used)
- [\v]: vertical tabs worked in 4.5.3 but they are invalid in 4.6
- [\j\k\l]: but now they are invalid.
- [d-a]: although it's a bit weird, it was valid in 4.5.3, but it's not in 4.6

If these changes are expected, then I'm going to create some PRs in grammars-v4, since several of them are failing now.
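
For anyone triaging older grammars, the four cases above can be approximated with a small standalone checker. This is only a sketch and not ANTLR's own code: the names `ASSUMED_ESCAPES`, `tokenize` and `check_char_set` are hypothetical, and the set of accepted escapes is an assumption pieced together from this thread and the lexer-rule documentation, so the real tool may accept or reject more than this does.

```python
import string

# ASSUMPTION: escapes treated as valid inside [...]. This set is a guess based
# on the thread and the lexer-rule docs, not on the ANTLR tool's source.
ASSUMED_ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f",
                   "\\": "\\", "]": "]", "-": "-"}


def tokenize(body):
    """Split a char-set body (the text between '[' and ']') into
    (char, is_unescaped_dash) pairs and collect invalid escapes."""
    tokens, problems = [], []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            tokens.append((ch, ch == "-"))
            i += 1
            continue
        nxt = body[i + 1:i + 2]  # empty string for a trailing backslash
        if nxt == "u":
            digits = body[i + 2:i + 6]
            if len(digits) == 4 and all(d in string.hexdigits for d in digits):
                tokens.append((chr(int(digits, 16)), False))
                i += 6
                continue
        if nxt in ASSUMED_ESCAPES:
            tokens.append((ASSUMED_ESCAPES[nxt], False))
        else:
            problems.append("invalid escape sequence \\" + nxt)
        i += 2
    return tokens, problems


def check_char_set(body):
    """Return a list of problems: invalid escapes first, then reversed ranges."""
    tokens, problems = tokenize(body)
    i = 0
    while i < len(tokens):
        # A range is: start char, unescaped '-', end char.
        if i + 2 < len(tokens) and tokens[i + 1][1]:
            lo, hi = tokens[i][0], tokens[i + 2][0]
            if ord(lo) > ord(hi):
                problems.append(f"reversed range {lo}-{hi}")
            i += 3
        else:
            i += 1
    return problems


if __name__ == "__main__":
    for body in [r"\[", "[", r"\v", r"\u000B", r"\j\k\l", "d-a", "a-d"]:
        print(repr(body), check_char_set(body))
```

Under these assumptions the checker flags `\[`, `\v`, `\j\k\l` and `d-a`, and accepts the suggested replacements `[` and `\u000B`. It only mirrors the behaviour reported here; running the grammar through the ANTLR tool itself remains the real test.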