[WIP] parsing fixes, better error handling, fuzzing #35
Whoa, this is very cool. Thank you @oberblastmeister!! I'll start trying this out locally while developing rnix-lsp :-)
name = "lexer"
path = "fuzz_targets/lexer.rs"
test = false
doc = false
Missing final new line
writeln!(handle, "Fuzzing {:?}\n\n", data).unwrap();
let _ = rnix::parse(text);
}
});
Missing final new line
@@ -411,6 +500,7 @@ where
return self.builder.checkpoint();
}
};
// println!("peek is: {:?}", peek);
Would you mind removing these debugging lines?
So this is actually way too much to merge as-is IMHO. I'm strongly in favor of applying the fuzzer part, but I'm not sure about the rest currently.
Well I think that our parser actually being able to parse stuff correctly is pretty important :].
Yep I agree, correct parsing is important. And improved error handling is necessary for providing autocomplete for rnix-lsp. @oberblastmeister Would you mind splitting this into smaller PRs, such as maybe:
Yeah I can split them up
Thank you!!
I have to apologize actually, my wording was rather poor! What I meant was exactly what @aaronjanse meant (we briefly talked about it previously), so thanks a lot for your work on this, but splitting things up would actually help us to get things in! :)
Does this mean the parser (I don't know the current state) will not be parsing in linear time then? Is the Nix Language not LL(k) parseable?
Re-assigning to 0.11.0. @oberblastmeister do you think you'll be able to get to splitting things up during the next weeks? I'm not sure when I'll get to it, but otherwise I'd resume here :)
Perhaps it is, but it doesn't seem desirable. Further context is in https://edolstra.github.io/pubs/phd-thesis.pdf (Section 4.2, page 64):
This PR should not change the speed of the parser. It does not change what the parser does, only the error handling. The parser still does the same amount of lookahead as before.
port of @oberblastmeister's changes from nix-community#35
I'm closing this since this PR has been split up. The only thing left to do is error handling.
Currently the parser doesn't handle operator associativity correctly: everything is left-associative. This is wrong for operators like `++`, `->`, and `//`. I fixed it for those operators. I also incorporated #34, but this error handling is more general and not just for semicolons. It uses the concept of recovery, which is taken from rust-analyzer. I added fuzzing for the lexer and the parser, which addresses #32. I have fixed the parser to pass all the fuzz tests. I removed `cbitset`, which seems like an unnecessary dependency because we don't need all the different bitset variants, only the largest. It is easy to write one by hand, and it is very short. I have put the bitset into use everywhere to check whether a token is contained in a set, rather than using a slice that has to be looped over. My code is very messy right now and I don't have tests, but I will add them as I go.
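For readers who want a feel for the hand-written bitset mentioned above, here is a minimal sketch in the style rust-analyzer uses for its token sets. It is not the code from this PR: `Kind`, `TokenSet`, and `RIGHT_ASSOC` are hypothetical names chosen for illustration, and the real token kinds in rnix differ. It assumes there are at most 128 token kinds, so one `u128` is enough.

```rust
/// Hypothetical token kinds, for illustration only (not rnix's real kinds).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum Kind {
    Ident,
    Semicolon,
    Concat,  // ++
    Implies, // ->
    Update,  // //
    Add,     // +
}

/// A set of token kinds, stored as one bit per kind in a single integer.
/// Assumes fewer than 128 kinds.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct TokenSet(u128);

impl TokenSet {
    pub const EMPTY: TokenSet = TokenSet(0);

    /// Builds a set from a slice; `const` so sets can be declared as constants.
    pub const fn from_slice(kinds: &[Kind]) -> TokenSet {
        let mut bits = 0u128;
        let mut i = 0;
        while i < kinds.len() {
            bits |= 1u128 << (kinds[i] as u32);
            i += 1;
        }
        TokenSet(bits)
    }

    /// Returns the set containing every kind from either operand.
    pub const fn union(self, other: TokenSet) -> TokenSet {
        TokenSet(self.0 | other.0)
    }

    /// Membership is a single mask-and-compare, with no loop over a slice.
    pub const fn contains(self, kind: Kind) -> bool {
        self.0 & (1u128 << (kind as u32)) != 0
    }
}

/// The operators the description above names as wrongly left-associative.
pub const RIGHT_ASSOC: TokenSet =
    TokenSet::from_slice(&[Kind::Concat, Kind::Implies, Kind::Update]);
```

With something like this, a parser can ask `RIGHT_ASSOC.contains(kind)` when deciding how to bind an operator, or build a recovery set with `union` and check the next token against it, instead of scanning a slice each time.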