Inconsistent parse results from simple ambiguous grammar #1434

Closed
mcarpenter opened this issue Jun 30, 2024 · 4 comments · Fixed by #1435

Comments

@mcarpenter

Describe the bug

I have a small grammar that is a starting point to parse regular expressions. When run repeatedly on the same input Lark produces two different parse trees.

My toy grammar is ambiguous because an RE quantifier (*, +, ?) can also be interpreted as a printable_char (any ASCII character from space to tilde). However, my reading of previous issues (#81, #83) indicates that inconsistent parse results should be treated as a bug, and the behaviour was certainly a surprise.
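To make the ambiguity concrete, here is a minimal pure-Python sketch that enumerates every way an input can be split into factors under this grammar. It does not use Lark and is not how Lark's Earley parser works internally; the helper names (`enumerate_parses`, `atom_label`) are hypothetical, and it assumes that "." is always labelled DOT, as in the sample results.

```python
QUANTIFIERS = {"*", "+", "?"}


def is_printable(ch):
    """True for the characters matched by /[ -~]/ (space through tilde)."""
    return " " <= ch <= "~"


def atom_label(ch):
    # Assumption: "." is reported as DOT, every other character as printable_char.
    return "DOT" if ch == "." else "printable_char"


def _splits(text):
    """Return every split of text into (label, char, quantifier-or-None) factors."""
    if not text:
        return [[]]
    head = text[0]
    if not is_printable(head):
        return []
    results = []
    # Reading 1: the atom stands alone, so the next character starts a new factor.
    for rest in _splits(text[1:]):
        results.append([(atom_label(head), head, None)] + rest)
    # Reading 2: the next character is consumed as this atom's quantifier.
    if len(text) >= 2 and text[1] in QUANTIFIERS:
        for rest in _splits(text[2:]):
            results.append([(atom_label(head), head, text[1])] + rest)
    return results


def enumerate_parses(text):
    """All factor sequences for text; `start: factor+` needs at least one factor."""
    return [parse for parse in _splits(text) if parse]


for parse in enumerate_parses("a.?"):
    print(parse)
```

For the input `a.?` this yields two factor sequences: one where `?` is its own printable_char atom and one where `?` quantifies the DOT. These correspond to the two trees reported under Sample results.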

To Reproduce

Lark 1.1.9, Python 3.10.12.

#!/usr/bin/env python

from lark import Lark

parser = Lark('''
start: factor+

?factor: atom quantifier?

quantifier: "*" | "+" | "?"

?atom: DOT | printable_char

printable_char: /[ -~]/

DOT: "."
''')

regex = r'a.?'
tree = parser.parse(regex)
print(tree.pretty())

Sample results

start
  printable_char	a
  .
  printable_char	?

start
  printable_char	a
  factor
    .
    quantifier
@MegaIng
Member

MegaIng commented Jun 30, 2024

Does this still happen with the latest master? (pip install git+https://github.com/lark-parser/lark)

@mcarpenter
Author

Yes, it still happens. (pip list reports 1.2.0, and I still get two different results.)

I also found discussion #1288, and setting the environment variable PYTHONHASHSEED=0 does indeed yield consistent results (the second of the two trees shown above). Similarly, raising the rule priority by writing quantifier.2 also fixes it for me.
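The PYTHONHASHSEED observation can be illustrated without Lark. The sketch below is an assumption about the general mechanism, not a confirmed description of Lark's internals: Python randomizes string hashes per process, so anything that iterates over a set of strings may see a different order on each run unless the seed is fixed. The helper `set_order` and the snippet it runs are hypothetical names chosen for illustration.

```python
import os
import subprocess
import sys

# A set of strings whose iteration order depends on the per-process hash seed.
SNIPPET = "print(list({'start', 'factor', 'quantifier', 'atom', 'printable_char'}))"


def set_order(seed):
    """Run SNIPPET in a fresh interpreter with the given PYTHONHASHSEED value."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# With a fixed seed, every run reports the same order.
print(set_order(0))
print(set_order(0))

# With "random" (the default behaviour), the order may differ between runs.
print(set_order("random"))
print(set_order("random"))
```

If a parser resolves an ambiguity by taking whichever candidate it encounters first while iterating such a set, a fixed seed would make the choice repeatable, which is consistent with the behaviour described above.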

erezsh added 3 commits that referenced this issue Jun 30, 2024
@erezsh
Member

erezsh commented Jun 30, 2024

@mcarpenter Try the latest master again. (You might need to uninstall lark first.)

@mcarpenter
Author

That got it, thanks.
