[WIP] parsing fixes, better error handling, fuzzing #35

oberblastmeister · 2021-09-28T23:54:44Z

Currently the parser doesn't handle operator associativity correctly: everything is parsed as left-associative. This is wrong for operators like ++, ->, and //, which are right-associative. I fixed it for those operators.

I also incorporated #34, but the error handling here is more general and not just for semicolons. It uses the concept of recovery which is taken from rust-analyzer.

I added fuzzing for the lexer and the parser, which addresses #32, and I have fixed the parser so that it passes all the fuzz tests.

I removed cbitset, which seems like an unnecessary dependency: we don't need all the different bitset variants, only the largest, and a hand-written one is very short. The bitset is now used everywhere to check whether a token is contained in a set, rather than looping over a slice.

My code is very messy right now and I don't have tests, but I will add them as I go.
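For illustration, one common way to get per-operator associativity in a hand-written expression parser is a binding-power (Pratt-style) loop. The sketch below is not the code from this PR: the Tok type, the binding_power table, and its numbers are all made up for the example. It only shows how a right-associative operator such as ++ ends up grouped differently from a left-associative one such as +.

```rust
// Hypothetical token type for this sketch; rnix-parser's real token kinds differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    Ident(char),
    Add,     // +   (left-associative)
    Concat,  // ++  (right-associative)
    Update,  // //  (right-associative)
    Implies, // ->  (right-associative)
}

// (left, right) binding powers, chosen only for illustration.
// left < right makes an operator left-associative,
// left > right makes it right-associative.
fn binding_power(tok: Tok) -> Option<(u8, u8)> {
    match tok {
        Tok::Implies => Some((2, 1)),
        Tok::Update => Some((4, 3)),
        Tok::Add => Some((5, 6)),
        Tok::Concat => Some((8, 7)),
        Tok::Ident(_) => None,
    }
}

fn symbol(tok: Tok) -> &'static str {
    match tok {
        Tok::Add => "+",
        Tok::Concat => "++",
        Tok::Update => "//",
        Tok::Implies => "->",
        Tok::Ident(_) => "?",
    }
}

// Parses tokens starting at `*pos` and returns a fully parenthesised
// string so that the grouping is visible.
fn parse_expr(toks: &[Tok], pos: &mut usize, min_bp: u8) -> String {
    let mut lhs = match toks.get(*pos) {
        Some(Tok::Ident(c)) => {
            *pos += 1;
            c.to_string()
        }
        other => panic!("expected identifier, found {:?}", other),
    };
    while let Some(&op) = toks.get(*pos) {
        let (l_bp, r_bp) = match binding_power(op) {
            Some(bp) => bp,
            None => break,
        };
        // Stop if this operator binds less tightly than the caller requires.
        if l_bp < min_bp {
            break;
        }
        *pos += 1;
        let rhs = parse_expr(toks, pos, r_bp);
        lhs = format!("({} {} {})", lhs, symbol(op), rhs);
    }
    lhs
}
```

With this table, a ++ b ++ c groups as (a ++ (b ++ c)), while a + b + c groups as ((a + b) + c).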

aaronjanse · 2021-09-30T18:05:18Z

Whoa, this is very cool. Thank you @oberblastmeister!! I'll start trying this out locally while developing rnix-lsp :-)

SuperSandro2000 · 2021-11-01T15:19:55Z

fuzz/Cargo.toml

+name = "lexer"
+path = "fuzz_targets/lexer.rs"
+test = false
+doc = false


Missing final new line

SuperSandro2000 · 2021-11-01T15:19:59Z

fuzz/fuzz_targets/parser.rs

+        writeln!(handle, "Fuzzing {:?}\n\n", data).unwrap();
+        let _ = rnix::parse(text);
+    }
+});


Missing final new line

aaronjanse · 2021-11-05T17:30:19Z

src/parser.rs

@@ -411,6 +500,7 @@ where
                return self.builder.checkpoint();
            }
        };
+        // println!("peek is: {:?}", peek);


Would you mind removing these debugging lines?

Ma27 · 2021-11-12T13:58:31Z

So this is actually way too much to merge as-is IMHO.

I'm strongly in favor of applying the fuzzer-part, but I'm not sure about the rest currently.

oberblastmeister · 2021-11-15T14:35:55Z

Well I think that our parser actually being able to parse stuff correctly is pretty important :].

aaronjanse · 2021-11-15T23:03:40Z

Yep I agree, correct parsing is important. And improved error handling is necessary for providing autocomplete for rnix-lsp.

@oberblastmeister Would you mind splitting this into smaller PRs, such as maybe:

- fuzzing
- error handling
- correct operation handling
- dependency changes

oberblastmeister · 2021-11-16T01:38:15Z

Yeah I can split them up

aaronjanse · 2021-11-16T04:14:23Z

Thank you!!

Ma27 · 2021-11-16T09:47:43Z

> Well I think that our parser actually being able to parse stuff correctly is pretty important :].

I have to apologize, actually: my wording was rather poor! What I meant was exactly what @aaronjanse meant (we briefly talked about it previously). So thanks a lot for your work on this, but splitting things up would actually help us get things in! :)

mohe2015 · 2021-11-20T17:51:54Z

> It uses the concept of recovery which is taken from rust-analyzer.

Does this mean the parser (I don't know the current state) will not be parsing in linear time then? Is the Nix Language not LL(k) parseable?

Ma27 · 2021-11-30T00:23:58Z

Re-assigning to 0.11.0. @oberblastmeister do you think you'll be able to get to splitting things up in the next few weeks? I'm not sure when I'll get to it, but otherwise I'd resume here :)

> Does this mean the parser (I don't know the current state) will not be parsing in linear time then? Is the Nix Language not LL(k) parseable?

Perhaps it is, but it doesn't seem desirable. Further context is in https://edolstra.github.io/pubs/phd-thesis.pdf (Section 4.2, page 64):

> Most context-free parsers (e.g., LL(k) and LR(k) parsers, for fixed k) suffer from bounded look-ahead: they must choose between different productions on the basis of a fixed number of tokens. This is inconvenient for language implementors, since they must deal with the resulting parse conflicts. For instance, the input fragment "{x" could be start of an attribute set (e.g., {x=123;}), or a function definition (e.g., {x}: x). This requires the programmer to introduce additional production rules to resolve the conflict
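To make the "{x" example concrete, here is a minimal sketch of how a hand-written parser could peek past the identifier before committing to a production. Everything in it (the Tok and BraceKind types and the classify_brace function) is hypothetical and is not how rnix-parser decides. It covers only the simple cases from the quote and ignores forms such as inherit, quoted or dotted attribute names, and the ellipsis pattern.

```rust
// Hypothetical token kinds, reduced to what the example needs.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    LBrace,
    RBrace,
    Ident,
    Assign,   // =
    Comma,    // ,
    Question, // ?
    Colon,    // :
}

#[derive(Debug, PartialEq)]
enum BraceKind {
    AttrSet,
    LambdaPattern,
}

// Looks at up to three tokens: `{`, an identifier, and the token after it.
// Returns None when those tokens are missing or do not match a known case.
fn classify_brace(toks: &[Tok]) -> Option<BraceKind> {
    if toks.first() != Some(&Tok::LBrace) {
        return None;
    }
    match (toks.get(1)?, toks.get(2)?) {
        // `{ x = ...` can only continue as an attribute set.
        (Tok::Ident, Tok::Assign) => Some(BraceKind::AttrSet),
        // `{ x }`, `{ x ,` and `{ x ?` can only continue as a function pattern.
        (Tok::Ident, Tok::RBrace)
        | (Tok::Ident, Tok::Comma)
        | (Tok::Ident, Tok::Question) => Some(BraceKind::LambdaPattern),
        _ => None,
    }
}
```

In this sketch the tokens for {x=123;} classify as an attribute set after seeing the =, and the tokens for {x}: x classify as a function pattern after seeing the closing brace.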

oberblastmeister · 2021-11-30T13:38:46Z

This PR should not change the speed of the parser. It does not change what the parser does, only the error handling. The parser still does the same amount of lookahead as before.
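As a rough illustration of recovery-style error handling with a hand-written bitset, here is a minimal sketch. It is loosely modelled on the idea of a recovery token set and is not the code from this PR: Kind, TokenSet, Parser, and expect_or_recover are hypothetical names, and the toy grammar (a run of `ident = number ;` bindings) is invented for the example. On a mismatch the parser records an error and skips at most one token, and only if that token is not in the recovery set, so the amount of lookahead stays the same.

```rust
// Hypothetical token kinds; each one maps to a bit position.
#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(u8)]
enum Kind {
    Ident,
    Assign,
    Number,
    Semicolon,
    RBrace,
    Eof,
}

// A hand-written bitset over token kinds: membership is one shift and one AND.
#[derive(Clone, Copy)]
struct TokenSet(u128);

impl TokenSet {
    fn new(kinds: &[Kind]) -> TokenSet {
        let mut bits = 0u128;
        for &k in kinds {
            bits |= 1u128 << (k as u8);
        }
        TokenSet(bits)
    }

    fn contains(self, kind: Kind) -> bool {
        self.0 & (1u128 << (kind as u8)) != 0
    }
}

struct Parser<'a> {
    toks: &'a [Kind],
    pos: usize,
    errors: Vec<String>,
}

impl<'a> Parser<'a> {
    fn new(toks: &'a [Kind]) -> Parser<'a> {
        Parser { toks, pos: 0, errors: Vec::new() }
    }

    fn peek(&self) -> Kind {
        self.toks.get(self.pos).copied().unwrap_or(Kind::Eof)
    }

    // Consumes `expected` if present. Otherwise records an error and skips the
    // offending token unless it is in `recovery` (or the input has ended).
    fn expect_or_recover(&mut self, expected: Kind, recovery: TokenSet) -> bool {
        let found = self.peek();
        if found == expected {
            self.pos += 1;
            return true;
        }
        self.errors.push(format!("expected {:?}, found {:?}", expected, found));
        if found != Kind::Eof && !recovery.contains(found) {
            self.pos += 1;
        }
        false
    }

    // Parses `ident = number ;` bindings and returns how many were well-formed.
    fn parse_bindings(&mut self) -> usize {
        let recovery = TokenSet::new(&[Kind::Semicolon, Kind::RBrace]);
        let closing = TokenSet::new(&[Kind::RBrace]);
        let mut complete = 0;
        while self.peek() != Kind::Eof && self.peek() != Kind::RBrace {
            let mut ok = self.expect_or_recover(Kind::Ident, recovery);
            ok &= self.expect_or_recover(Kind::Assign, recovery);
            ok &= self.expect_or_recover(Kind::Number, recovery);
            ok &= self.expect_or_recover(Kind::Semicolon, closing);
            if ok {
                complete += 1;
            }
        }
        complete
    }
}
```

For the token sequence of `x = ; y = 1 ;`, this sketch reports a single error for the missing value, leaves the semicolon in place because it is in the recovery set, and still parses the second binding.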

oberblastmeister · 2022-07-23T15:15:07Z

I'm closing this since this PR has been split up. The only thing left to do is error handling.

oberblastmeister added 6 commits September 28, 2021 15:24

start

3ba221d

more

89eb373

ok

7c83f3d

more

3a2efa9

more

22eb94c

assoc

6710b7f

oberblastmeister changed the title ~~[WIP] parsing fixes and better error handling~~ [WIP] parsing fixes, better error handling, fuzzing Sep 28, 2021

Ma27 requested review from Ma27 and aaronjanse October 8, 2021 11:38

Ma27 added this to the 0.10.0 milestone Oct 8, 2021

Ma27 added the bug and enhancement labels Oct 8, 2021

oppiliappan mentioned this pull request Nov 1, 2021

Extra parenthesis are not removed when used with lib.optionals oppiliappan/statix#14

Open

Ma27 mentioned this pull request Nov 30, 2021

better typed api, update rowan, remove SmolStr #36

Closed

Ma27 modified the milestones: 0.10.0, 0.11.0 Nov 30, 2021

oppiliappan added a commit to oppiliappan/rnix-parser that referenced this pull request Feb 13, 2022

handle operator associavity

87d8f4f

port of @oberblastmeister's changes from nix-community#35

oppiliappan mentioned this pull request Feb 13, 2022

handle operator associavity #75

Merged

darichey pushed a commit to darichey/rnix-parser that referenced this pull request Jun 15, 2022

handle operator associavity

a460c17

port of @oberblastmeister's changes from nix-community#35

oberblastmeister closed this Jul 23, 2022

oberblastmeister mentioned this pull request Jul 27, 2022

handle syntax errors more gracefully #34

Closed

aaronjanse mentioned this pull request Jul 28, 2022

improved error handling #120

Closed
