Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: panic-mode error recovery #71

Open
shivansh opened this issue May 6, 2018 · 7 comments
Open

Proposal: panic-mode error recovery #71

shivansh opened this issue May 6, 2018 · 7 comments

Comments

@shivansh
Copy link
Collaborator

shivansh commented May 6, 2018

As far as I understand, gocc halts after reporting the first encountered error. Using panic-mode error recovery it will be able to report more than one error at a time, thus avoiding the need to do multiple compiles in case more than one error exists.

Implementation details

In case an invalid state is encountered, following steps are taken in order to recover -

  • The stack is scanned until a state s with goto on a particular nonterminal A is found.

  • Zero or more input symbols are discarded until a symbol is found which belongs to follow(A).

  • In case there is more than one choice of nonterminal A, one of them is chosen randomly. This might lead to poor error reports with each recovery.

  • The parser pushes the state GOTO(s, A) on the stack and resumes normal parsing.

  • The overall number of recoveries can be capped to some specific value (8?), as the error reports after a while might not be useful at all.

This mode can be provided as an option, so that the user can choose whether to use the currently implemented mode (exit on error) or the proposed recovery mode.

In case the above is agreeable, I'm happy to prepare a PR which implements panic-mode error recovery.

@awalterschulze
Copy link
Collaborator

Very interesting :)
I am a bit afraid to give a yes at this point.
I would want to see how it works, with examples.
But I am very intrigued.

I am a bit worried about the random element. I would prefer if some deterministic choice was made, that would make it at least somewhat possible for the user to reason about.

I don't know how much work it is, but maybe it is something that you would want to attempt in a fork and then show us how a proof of concept is working, with examples?

@mewmew what do you think?

@shivansh
Copy link
Collaborator Author

shivansh commented May 6, 2018

I would want to see how it works, with examples.

@awalterschulze Sure. I'll try things first and demonstrate if it works out.

I am a bit worried about the random element. I would prefer if some deterministic choice was made, that would make it at least somewhat possible for the user to reason about.

Makes sense. So instead of choosing randomly, we'll lookup the goto table for state s, choose the first nonterminal from the list of valid ones and proceed. This will make things deterministic.

I don't know how much work it is, but maybe it is something that you would want to attempt in a fork and then show us how a proof of concept is working, with examples?

Sure!

@shivansh
Copy link
Collaborator Author

shivansh commented May 6, 2018

Computing follow sets

For proceeding with the error recovery, computation of follow sets for all the nonterminals will be required. I don't think this has been implemented in gocc yet, so I guess I'll go ahead and implement it unless there is an existing alternative approach.

Also, I did come across FollowingSymbol.

@shivansh
Copy link
Collaborator Author

shivansh commented May 8, 2018

Update

This commit describes a very simple demo using the calc example, and also the final problem [1] [2] (details in the commit message). Hopefully I'll figure this out soon.

@awalterschulze
Copy link
Collaborator

How is this different from the error recovery mode that is already a feature of gocc?
In that error recovery mode you need to provide hints in the bnf about where errors can occur, whereas this is more automatic.
Or am I totally missing the point, because I haven't done these things recently enough :(

@shivansh
Copy link
Collaborator Author

shivansh commented May 9, 2018

In that error recovery mode you need to provide hints in the bnf about where errors can occur, whereas this is more automatic.

Yep, the difference in how the two modes behave is that the former (in gocc) requires hints in form of modifications to production rules, notifying which ones can recover. In panic mode, the recovery is automatic. Also, it is highly probable that the recovery will succeed in most of the cases.
In terms of implementation the two modes vary significantly, the latter depending on data from GOTO table and follow sets.

Concerning the error recovery mode implemented in gocc, I think I've figured out why the b in the input a b ; d e f in error recovery example went missing (it was marked TBD in the docs). I'll discuss that in a separate issue/PR, just need to formalize my arguments a bit.

@awalterschulze
Copy link
Collaborator

Ooo if you could figure out why the b went missing, that would be super valuable :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants