Detect lexical ambiguity due to regex overlap #4271

Open
Scott-Guest opened this issue Apr 23, 2024 · 4 comments

If two regex terminals overlap (that is, some string is matched by both) and have the same prec(_) value, their relative precedence in the lexer is nondeterministic: it depends on set iteration order (see #4255).

We should detect such ambiguities, either during every kompilation if it can be done cheaply, or with a dedicated -Wregex-overlap flag.
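As a rough illustration of what such a check computes, here is a minimal sketch in Python. It is not the K implementation and does not use dk.brics.automaton; the tuple-based regex AST, `build_nfa`, and `find_overlap` are hypothetical names invented for this example. Two regexes overlap exactly when the intersection of their languages is non-empty, which the sketch decides by searching the product of two Thompson NFAs for a state where both accept.

```python
from collections import deque
from itertools import count

# Hypothetical structured regex AST, for illustration only:
#   ("chars", "abc")  any single character from the given set
#   ("cat", r, s)     r followed by s
#   ("alt", r, s)     r or s
#   ("star", r)       zero or more repetitions of r

def build_nfa(regex):
    """Thompson construction. Returns (start, accept, edges), where edges maps
    state -> list of (label, target); label None means an epsilon move,
    otherwise it is a frozenset of characters."""
    ids = count()
    edges = {}

    def new_state():
        state = next(ids)
        edges[state] = []
        return state

    def go(node):
        kind = node[0]
        start, accept = new_state(), new_state()
        if kind == "chars":
            edges[start].append((frozenset(node[1]), accept))
        elif kind == "cat":
            s1, a1 = go(node[1])
            s2, a2 = go(node[2])
            edges[start].append((None, s1))
            edges[a1].append((None, s2))
            edges[a2].append((None, accept))
        elif kind == "alt":
            for sub in node[1:]:
                s, a = go(sub)
                edges[start].append((None, s))
                edges[a].append((None, accept))
        elif kind == "star":
            s, a = go(node[1])
            edges[start] += [(None, s), (None, accept)]
            edges[a] += [(None, s), (None, accept)]
        else:
            raise ValueError(f"unknown node kind: {kind}")
        return start, accept

    start, accept = go(regex)
    return start, accept, edges

def eps_closure(states, edges):
    """All states reachable from `states` using only epsilon moves."""
    seen, stack = set(states), list(states)
    while stack:
        for label, target in edges[stack.pop()]:
            if label is None and target not in seen:
                seen.add(target)
                stack.append(target)
    return frozenset(seen)

def step(states, ch, edges):
    """States reachable from `states` by reading the character `ch`."""
    moved = {t for s in states for label, t in edges[s]
             if label is not None and ch in label}
    return eps_closure(moved, edges)

def find_overlap(r1, r2):
    """Return a shortest string matched by both regexes, or None if disjoint."""
    s1, a1, e1 = build_nfa(r1)
    s2, a2, e2 = build_nfa(r2)
    alphabet = set()
    for edges in (e1, e2):
        for outgoing in edges.values():
            for label, _ in outgoing:
                if label is not None:
                    alphabet |= label
    start = (eps_closure({s1}, e1), eps_closure({s2}, e2))
    queue, seen = deque([(start, "")]), {start}
    while queue:  # breadth-first search over pairs of NFA state sets
        (x, y), word = queue.popleft()
        if a1 in x and a2 in y:
            return word  # both automata accept `word`
        for ch in sorted(alphabet):
            nxt = (step(x, ch, e1), step(y, ch, e2))
            if nxt[0] and nxt[1] and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + ch))
    return None

LETTERS = "abcdefghijklmnopqrstuvwxyz"
DIGITS = "0123456789"
IDENT = ("cat", ("chars", LETTERS), ("star", ("chars", LETTERS + DIGITS)))
NUMBER = ("cat", ("chars", DIGITS), ("star", ("chars", DIGITS)))
KEYWORD_IF = ("cat", ("chars", "i"), ("chars", "f"))
```

With these example terminals, `find_overlap(IDENT, KEYWORD_IF)` returns the witness `"if"`, while `find_overlap(IDENT, NUMBER)` returns `None`. The search is bounded by the number of pairs of state sets, so it always terminates for plain regular expressions, though it can blow up exponentially in the worst case.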

Scott-Guest self-assigned this Apr 23, 2024

Scott-Guest commented Apr 23, 2024

Blocked on #4266. If we decide to allow negative lookbehind/lookahead, detecting overlap becomes undecidable.


Scott-Guest commented Apr 30, 2024

Blocked on #4295. We need a structured representation of regexes in order to translate them to the dk.brics.automaton syntax.


Scott-Guest commented May 23, 2024

Also consider tokens that are not defined by regular expressions. We may then need to add prec attributes to those tokens as well.
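For a token given as a fixed string, the overlap question reduces to whether some regex terminal fully matches that string. The sketch below illustrates this with Python's `re` module. The function name, the terminal names, and the patterns are hypothetical, and Python's regex syntax is only a stand-in for K's.

```python
import re

def literal_regex_conflicts(literals, regex_terminals):
    """Return sorted (literal, terminal_name) pairs where the literal token
    is fully matched by the regex terminal, i.e. the lexer could emit either."""
    conflicts = []
    for literal in literals:
        for name, pattern in regex_terminals.items():
            if re.fullmatch(pattern, literal):
                conflicts.append((literal, name))
    return sorted(conflicts)

# Hypothetical example grammar: three literal tokens and two regex terminals.
EXAMPLE_LITERALS = ["if", "+", "42"]
EXAMPLE_TERMINALS = {"Id": r"[a-z][a-z0-9]*", "Int": r"[0-9]+"}
```

Here `literal_regex_conflicts(EXAMPLE_LITERALS, EXAMPLE_TERMINALS)` reports `("42", "Int")` and `("if", "Id")`, the two literals that would need a prec attribute (or some other tie-break) to be resolved deterministically against a regex terminal.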

Scott-Guest commented

Also consider user list separators.
