-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support left recursion #123
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I skimmed of this PR and I am by far not through yet (especially, I have not yet started to look into the parts for the left recursion handling). I have some general questions though:
- Can you please provide a general summary, how the left recursion handling is implemented.
- Why do all the generated parsers change even if the flag
support-left-recursion
is not set totrue
? Can you please explain the changes. - How does this change impact the performance of the generated parsers (optimized and not optimized). A good example for this is the json parser in
examples/json
. - Can you please check how the documentation would need to be changed (e.g.
README.md
,doc.go
)
Additionally, I left some minor initial comments.
For the left recursion rule, we apply it in a loop, while memoizing the result of its last execution. For the first time, we consider that the rule was not executed successfully. We stop when we cannot get a longer result. It is unlikely that I can describe it better than Guido van Rossum:
Priority support is implemented by saving rules and expressions in the stack. Therefore, at the stage of the new choice, we can check what choice we made last time. |
I tweaked the code a bit, now there are fewer changes. The remaining changes are needed to more conveniently integrate support for left recursion. Major change:
->
|
Performance decreased by ~2% on examples/json
|
Added to doc.go, main.go |
If I understand this correctly, left recursion support is only possible, if memoization is available. Of so, the flags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I now went through the complete PR and I have to admit, that from this first complete read, I am not able to fully follow how it works. This makes my currently feel somewhat unsure on how to proceed with this. In the end, I need to be confident in this code in order to keep maintaining it in the future.
Additional review comment:
- I think
map[...]bool
can be replaced everywhere withmap[...]struct{}
, which is more memory efficient.
@sc07kvm One more idea to gain some insights into the robustness of this implementation as well as coverage of edge cases would be to have two grammars for parsing the same input (e.g. arithmetic expressions), one with left recursion and one without and then use fuzzing to compare the results of the two. An other angle, that is worrying me a little bit is, that at the moment we do not have grammar examples combining left recursion with some of the other features like throw-recover and state so it is currently unknown, if this implementation does behave as expected in these cases. |
Replace map[string]bool with map[string]struct{} Add msgs in tests Correct typos Add testify library license Fix flaky test
Conceptually yes, memoization is needed. But if the I added a test for this case: test/left_recursion/optimized |
I need some time to do this |
Added:
|
@sc07kvm I have not forgotten you nor this PR. I am currently abroad and therefore only seldom online. I will try to have a look again, but I can not promise anything. I am sorry for the inconvenience. |
@sc07kvm I am back at work again and I checked this PR again. Thanks for all the work you have put into this. Do you mind to give me a brief summary about the current state of this PR. Do you have some open points you are work on or is this PR ready from you point of view? |
In my opinion PR is ready. I did everything I wanted on it, there were no unanswered questions on it. The status is as follows - support for left recursion has been implemented, a check for the presence of left recursion has been implemented, tests have been written for the operation of left recursion with all other grammar features. |
Thanks for the summary, this all sounds very good. I went through the PR again and this is one of the biggest PR ever for pigeon (at least in the recent time) and therefore I am still a little bit concerned about eventual negative side effects. In order to get some feedback from the community, I would like to call out to all of you that follow this PR as well as to some power users of pigeon (@mna, @xcoulon, @flowchartsman) with the request to give this PR a spin and report back if for your use-cases everything still works as intended. Thank you very much. |
I just successfully tested
|
Hey @breml , thanks for reaching out, I understand your concerns as this is a huge PR. Thanks to @sc07kvm for the significant contribution! I personally haven't used pigeon in a long time and I'm afraid I don't have any additional use-cases to check outside the test suite. I think a valid question is whether or not left recursion is a big enough problem to add that complexity in the parser (vs fixing/adjusting the grammar)? I sincerely don't know the answer to that, left recursion has generally not been an issue in my grammars but then again they've been more or less fully under my control. And I haven't followed up on the previous discussions, so apologies if that has already been discussed at length (and no disrespect to the huge effort made to implement this! The answer may very well be that it's worth adding this, and I guess it's the likely answer at this point since there's been a good amount of back and forth from what I can see). Sorry I cannot be of more help testing this. |
@mna Thanks for your feedback, this is always appreciated.
While it is true, that the grammars can always rearranged such that there are no left recursions, it is also true, that this often comes with the cost, that the grammars become less natural to read and therefore harder to maintain. Therefore adding this complexity to As you have mentioned, we already put quite some effort in testing and refining this PR. I am seriously considering to add this, but before doing so, I would like to have a very high confidence, that this does not break existing code. |
Totally agree with you @breml .
Yeah that makes sense to me. The fact that it is generally used as a tool (and not imported as a package) makes it harder to automate some kind of "testing in the wild", as no importers are reported in pkg.go.dev. One thing I did that resulted in a reasonable number of repos using This could be a starting point to automate testing the changes on a wide variety of grammars. |
tl/dr: tests with left recursion pigeon: ✅ 17 ➖ 13 ❌ 0 Today I ran some tests with (popular) packages, that use pigeon. I used this search https://github.com/search?utf8=%E2%9C%93&q=generate+pigeon+peg+path%3A*.go&type=code to find packages, that have the keywords Then I basically worked with the following sequence:
This is based on the assumption, that there are some tests, that cover the correct function of the generated parser. These are the results: ✅ https://github.com/bytesparadise/libasciidoc - worked after updating the Go version in Result: Concusion: Any other thoughts? |
👍 Agree with the confidence boost that this provides. |
🚀 It is merged! Thank you very much @sc07kvm for your great effort to make this happen. ❤️ A new release will follow soon. |
Addresses issues mentioned in #2 necessary for testing mna/pigeon#123 - remove old go generate directive from package parser - bring test in line with newer error message - update to mna/pigeon@latest & regenerate parser.
This PR is the first step towards the implementation of #120.
Implementation details are peeped here https://github.com/we-like-parsers/pegen_experiments/tree/master/pegen
Fixes: #120
It also fixes #79