Escape reserved words during grammar generation (@parrt version) #3453

Closed
wants to merge 1 commit into from

KvanTTT
Member

@KvanTTT KvanTTT commented Jan 2, 2022

fixes #1070

Always escape x to x_, X to X_, and XContext to X_Context (@parrt version, see #1070 (comment)). This is unrelated to the reserved-word escaping that some target languages support natively (C#, Swift).

Deprecate USE_OF_BAD_WORD
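The renaming rule in this description can be illustrated with a minimal sketch. It is not the code from this PR: the reserved-word set and both function names are hypothetical, and it assumes the context class name is formed by capitalizing the escaped rule name and appending `Context`.

```python
# Hypothetical reserved-word set, for illustration only.
# The real per-target lists live in the ANTLR tool sources.
RESERVED_WORDS = frozenset({"return", "class", "object"})


def escape_rule_name(name, reserved=RESERVED_WORDS):
    """Append an underscore when the rule name collides with a reserved word."""
    return name + "_" if name in reserved else name


def context_class_name(rule_name, reserved=RESERVED_WORDS):
    """Capitalize the (possibly escaped) rule name and append 'Context'."""
    escaped = escape_rule_name(rule_name, reserved)
    return escaped[0].upper() + escaped[1:] + "Context"


print(escape_rule_name("return"))    # return_
print(context_class_name("return"))  # Return_Context
print(context_class_name("expr"))    # ExprContext
```

Under these assumptions, a rule that does not collide keeps its usual names, so only grammars that use a reserved word see different generated identifiers.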

@KvanTTT KvanTTT force-pushed the runtime-bad-words-escaping-2 branch from 7078fbe to bfc7907 Compare January 2, 2022 18:02
@KvanTTT
Member Author

KvanTTT commented Jan 2, 2022

@kaby76 or @studentmain could you provide updated lists of reserved words (both keywords and conflicting symbols) for all runtimes while I'm completing the fixes?

@kaby76
Contributor

kaby76 commented Jan 2, 2022

> @kaby76 or @studentmain could you provide updated lists of reserved words (both keywords and conflicting symbols) for all runtimes while I'm completing the fixes?

Because grammars-v4 has an undocumented requirement that an Antlr g4 grammar be target-agnostic, the "bad word list" is merged across all targets rather than being target-specific. There is no differentiation between a keyword and a conflicting symbol: all bad words are off limits regardless. The initial version was culled from the Antlr tool sources, discussed here. But you are already modifying that code.

To get an updated, combined list, the easiest approach would be to use trxgrep and look for anything that ends in an underscore. But that would be a merged list for all targets. For anything more specific, I would have to write a grammar for each bad word and test it against each of the targets. That is easily doable, but I can't produce it quickly.
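The idea of collecting every name that ends in an underscore can be approximated with a rough sketch. This is not trxgrep and not the script attached later in the thread: it is a plain regex scan, the function names are hypothetical, and it will also pick up matches inside comments and string literals.

```python
import re
from pathlib import Path

# An identifier that ends in an underscore, e.g. "object_" or "Return_".
# The trailing \b rejects names like "foo_bar", where the underscore is internal.
TRAILING_UNDERSCORE = re.compile(r"\b[A-Za-z][A-Za-z0-9_]*_\b")


def underscore_names(text):
    """Return the set of identifiers in `text` that end in an underscore."""
    return set(TRAILING_UNDERSCORE.findall(text))


def collect_from_tree(root):
    """Merge the underscore-suffixed names found in every .g4 file under `root`."""
    names = set()
    for path in Path(root).rglob("*.g4"):
        names |= underscore_names(path.read_text(encoding="utf-8", errors="replace"))
    return sorted(names)
```

As in the comment above, the result is a single merged list: nothing in the scan records which target made the original name unusable.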

@KvanTTT
Member Author

KvanTTT commented Jan 2, 2022

> There is no differentiation between keyword vs a conflicting symbol --all bad words are off limits regardless

The distinction matters only for languages that support keyword escaping (C#, Swift). If it is unknown whether a word is a keyword, the word is treated as a conflicting symbol.
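A small sketch of why the distinction matters, under stated assumptions: the function, the table, and the word sets passed to it are hypothetical and are not taken from the ANTLR tool. It relies only on two language facts, that C# allows `@keyword` identifiers and that Swift allows backtick-quoted ones.

```python
# Targets that can escape a keyword natively. Illustrative table only.
KEYWORD_ESCAPE = {
    "CSharp": lambda w: "@" + w,       # C# verbatim identifier, e.g. @class
    "Swift": lambda w: "`" + w + "`",  # Swift backtick-quoted identifier
}


def escape(word, target, keywords, conflicting):
    """Escape `word` for `target`.

    Keywords use the target's native escape when one exists. Every other
    bad word, including one whose kind is unknown and which the caller
    therefore lists as conflicting, gets an underscore suffix.
    """
    if word in keywords and target in KEYWORD_ESCAPE:
        return KEYWORD_ESCAPE[target](word)
    if word in keywords or word in conflicting:
        return word + "_"
    return word


print(escape("class", "CSharp", {"class"}, set()))    # @class
print(escape("class", "Java", {"class"}, set()))      # class_
print(escape("parser", "CSharp", set(), {"parser"}))  # parser_
```

For a target without native escaping, both kinds of bad word end up with the same underscore suffix, which is why a merged list is enough there.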

> To get an updated, combined list, the easiest would be to use trxgrep and look for anything that ends in an underscore. But, that would be a merged list for all targets. I would have to write a grammar against each bad word, and test it against each of the targets, easily doable but I can't come up with that quickly.

That would be great. No rush, especially since I haven't finished yet.

@KvanTTT
Member Author

KvanTTT commented Jan 3, 2022

I'm closing this in favor of #3451. Let's continue there.

@KvanTTT KvanTTT closed this Jan 3, 2022
@kaby76
Contributor

kaby76 commented Jan 3, 2022

Here is the aggregated list of "bad symbols" (with an underscore at the end).

all_underscore.txt

It was computed using the following script on a clean git clone of grammars-v4.

get-all-unames.txt

Successfully merging this pull request may close these issues.

Have ANTLR4 prevent conflict with user rule names by behind-the-scenes renaming of its own variables
2 participants