Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verifying parser correctness #107

Open
darichey opened this issue Jul 22, 2022 · 7 comments
Open

Verifying parser correctness #107

darichey opened this issue Jul 22, 2022 · 7 comments

Comments

@darichey
Copy link
Contributor

darichey commented Jul 22, 2022

When making changes to the parser, I am always scared of regressions and properly handling edge cases in the Nix grammar. We want to ensure that rnix-parser parses Nix expressions correctly, but since Nix has no formal specification, the best we can ask in this regard is "does it do the same thing as the reference implementation?" I have built a tool to help us in answering that question: https://github.com/darichey/rnix-parser-tester.

There's a lengthy write-up if you're interested in the details, but the TLDR is you can feed it a Nix expression and it will tell you if rnix-parser and the reference implementation produced the same AST for that expression.

I have used it to verify the correctness of #96, #99, #101, #102, #103, and #106 by parsing all of nixpkgs. The tool provides a way to summarize the comparison results:

$ cargo run -- summary summary/f759fce.json summary/with-patches.json
...
== Summary ==
# equal before: 15924
# not equal before: 8135
# reference impl errors before: 1
# rnix-parser errors before: 1

# equal after: 24060
# not equal after: 0
# reference impl errors after: 1
# rnix-parser errors after: 0

# progressions: 8135
# regressions: 0

This tells us that before applying the patches, 8135 files in nixpkgs were parsed differently by rnix-parser. After applying the patches (and another one I'm still working on), all valid Nix files are parsed equivalently. (There are several caveats to these results which I cover here)

I wanted to make other contributors aware of the tool in case they would find it useful, and also ask if there would be any interest in somehow incorporating it into rnix-parser's testing (probably through CI). Thanks!

@oberblastmeister
Copy link
Contributor

I wonder if the new typed api would allow you to get rid of your custom ast type. The only difference would be that you would need lots of unwraps, but you already do those unwraps anyway when constructing the custom ast

@Ma27
Copy link
Member

Ma27 commented Jul 23, 2022

Very nice, thank you! :)

@oberblastmeister
Copy link
Contributor

By the way @darichey please rebase those prs so I can merge them!

@ncfavier
Copy link
Member

Maybe a silly question but: why don't we just expose the Nix parser as a library and use that in rnix-parser?

@darichey
Copy link
Contributor Author

darichey commented Jul 27, 2022

Do you mean as opposed to re-implementing the parser as has been done in rnix-parser? The primary reason (I would assume) is the desire for a fault-tolerant parser. It's very difficult to build nice tooling, especially editor tooling like rnix-lsp, around a parser which bails at any sign of trouble (which the reference Nix parser does, while rnix-parser does not)

@darichey
Copy link
Contributor Author

I wonder if the new typed api would allow you to get rid of your custom ast type. The only difference would be that you would need lots of unwraps, but you already do those unwraps anyway when constructing the custom ast

Yeah, the new typed api is much nicer to use, but really has the same main usability problem for this use case: I'm only interested in grammatically-valid Nix, so all the Options throughout the tree are just noise. I thought about trying to augment the nodes macros to also generate an error-less AST, but I don't know if it would work or if that use case is common enough to warrant any changes here (I doubt it).

@fogti
Copy link
Contributor

fogti commented Jul 29, 2022

It's a bit unfortunate that rust doesn't allow GATs yet, which could make the "maybe we want Options" ergonomic, and without code duplication (macro-based code duplication still incurs a compile-time penalty)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants