Escape reserved words during grammar generation (@parrt version) #3453

KvanTTT · 2022-01-02T17:29:57Z

Always escape x to x_, X to X_, and XContext to X_Context (@parrt's version, see #1070 (comment)). This is unrelated to the native reserved-word escaping that some target languages support (C#, Swift).
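To make the renaming rule concrete, here is a minimal sketch in Python. It is not the ANTLR tool's code: `RESERVED`, `escape_name`, and `context_class_name` are hypothetical names used only for illustration, and the word list is a tiny stand-in for a real bad-word list.

```python
# Illustrative sketch only; not the ANTLR tool's actual implementation.
# RESERVED is a hypothetical, tiny stand-in for a target's bad-word list.
RESERVED = {"class", "return", "Object"}


def escape_name(name: str) -> str:
    """Append an underscore when the name collides with a reserved word."""
    return name + "_" if name in RESERVED else name


def context_class_name(rule_name: str) -> str:
    """Derive the context class name from the (possibly escaped) rule name."""
    escaped = escape_name(rule_name)
    return escaped[0].upper() + escaped[1:] + "Context"
```

Under these assumptions a rule named `class` becomes `class_` with context class `Class_Context`, while a rule named `expr` is left alone and keeps `ExprContext`.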

Deprecate USE_OF_BAD_WORD


KvanTTT · 2022-01-02T18:27:42Z

@kaby76 or @studentmain could you provide updated lists of reserved words (both keywords and conflicting symbols) for all runtimes while I'm completing the fixes?

kaby76 · 2022-01-02T20:24:44Z

> @kaby76 or @studentmain could you provide updated lists of reserved words (both keywords and conflicting symbols) for all runtimes while I'm completing the fixes?

Because grammars-v4 has an undocumented requirement that an ANTLR g4 grammar be target-agnostic, the "bad word list" is merged across all targets rather than being target-specific. There is no differentiation between a keyword and a conflicting symbol: all bad words are off limits regardless. The initial version was culled from the ANTLR tool sources, discussed here. But you are already modifying that code.

To get an updated, combined list, the easiest way would be to use trxgrep and look for anything that ends in an underscore. But that would be a merged list for all targets. To split it by target, I would have to write a grammar for each bad word and test it against each of the targets. That is easily doable, but I can't produce it quickly.
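The "anything that ends in an underscore" search can be approximated with a plain regular expression. This is a rough sketch, not the trxgrep approach mentioned above; `underscore_names` is a hypothetical helper, and it scans raw text, so it would also pick up matches inside comments and string literals.

```python
import re

# Matches identifiers whose last character is an underscore, e.g. "class_".
UNDERSCORE_NAME = re.compile(r"\b[A-Za-z][A-Za-z0-9_]*_\b")


def underscore_names(grammar_text: str) -> list[str]:
    """Return the sorted, de-duplicated identifiers that end in an underscore."""
    return sorted(set(UNDERSCORE_NAME.findall(grammar_text)))
```

Running it over every grammar file and taking the union of the results would give a merged list for all targets, with the same limitation described above: it says nothing about which target each word conflicts with.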

KvanTTT · 2022-01-02T21:21:03Z

> There is no differentiation between a keyword and a conflicting symbol: all bad words are off limits regardless.

The differentiation is worthwhile only for languages that support keyword escaping (C#, Swift). If this information is unknown, a word is treated as a conflicting symbol.
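A small sketch of why the distinction matters only for those two languages. The `@` prefix (C#) and backticks (Swift) are the languages' own identifier-escaping syntax; everything else here, including `escape`, `KEYWORD_ESCAPE`, and the word sets passed in, is hypothetical and does not reflect what the ANTLR tool does.

```python
# Illustrative sketch only; the data and function names are assumptions.
KEYWORD_ESCAPE = {
    "CSharp": lambda w: "@" + w,        # C# verbatim identifier, e.g. @class
    "Swift": lambda w: "`" + w + "`",   # Swift backtick identifier, e.g. `class`
}


def escape(word, target, bad_words, keywords=None):
    """Escape a bad word for a target; fall back to a trailing underscore."""
    if word not in bad_words:
        return word
    # Native escaping applies only when the word is known to be a keyword
    # and the target language supports escaping keywords.
    if keywords is not None and word in keywords and target in KEYWORD_ESCAPE:
        return KEYWORD_ESCAPE[target](word)
    # Unknown classification, or a conflicting symbol: treat it the same way.
    return word + "_"
```

With a merged list and no keyword information, every target gets the underscore form, which matches the fallback described above.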

> To get an updated, combined list, the easiest way would be to use trxgrep and look for anything that ends in an underscore. But that would be a merged list for all targets. To split it by target, I would have to write a grammar for each bad word and test it against each of the targets. That is easily doable, but I can't produce it quickly.

That would be great. No rush, especially since I haven't finished yet.

KvanTTT · 2022-01-03T12:53:07Z

I'm closing this in favor of #3451. Let's continue there.

kaby76 · 2022-01-03T12:58:06Z

Here is the aggregated list of "bad symbols" (with an underscore at the end).

all_underscore.txt

It was computed using the following script on a clean git clone of grammars-v4.

get-all-unames.txt

Escape reserved words during grammar generation, fixes antlr#1070 (@p…

bfc7907

…arrt version) Deprecate USE_OF_BAD_WORD

KvanTTT force-pushed the runtime-bad-words-escaping-2 branch from 7078fbe to bfc7907 Compare January 2, 2022 18:02

KvanTTT mentioned this pull request Jan 2, 2022

Have ANTLR4 prevent conflict with user rule names by behind-the-scenes renaming of its own variables #1070

Closed

KvanTTT closed this Jan 3, 2022

KvanTTT mentioned this pull request Jan 3, 2022

Escape bad words during grammar generation #3451

Merged
