Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework and fix Attrpath and Dynamic parsing #106

Merged
merged 5 commits into from
Jul 23, 2022

Conversation

oxalica
Copy link
Contributor

@oxalica oxalica commented Jul 19, 2022

Summary & Motivation

Rework and fix Attrpath and Dynamic parsing

Current handling of Attrpath and Dynamic ${ is mostly incorrect.

Backwards-incompatible changes

  • NODE_OR_DEFAULT is removed
  • NODE_KEY -> NODE_ATTRPATH, NODE_KEY_VALUE -> NODE_ATTRPATH_VALUE
  • BinOp::IsSet is removed since it's actually not a normal binary operator.
  • NODE_SELECT is flatten in favor of NODE_ATTRPATH which consists of multiple Attrs, including a.b.c and a.b.c or expr
  • a or b is Apply, not "OrDefault", which matches the result of Nix. Fixes Doesn't parse valid Nix edge-cases #23
  • ${ is considered invalid at Expr places now, which matches the result of Nix.

Further context

The name of NODE_KEY is weird to me. It's called attrpath in Nix and in nixpkgs.
Should we rename it? This would also cause API breakage.

Renamed.

Copy link
Contributor

@darichey darichey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! This has been on my todo list for a bit :)

The name of NODE_KEY is weird to me. It's called attrpath in Nix and in nixpkgs.
Should we rename it?

+1 to renaming

a or b is Apply, not "OrDefault", which matches the result of Nix.

Does this fix #23?

src/types.rs Outdated Show resolved Hide resolved
src/kinds.rs Outdated Show resolved Hide resolved
@oxalica
Copy link
Contributor Author

oxalica commented Jul 20, 2022

Does this fix #23?

Yes. I referenced it in the description now.

@oxalica
Copy link
Contributor Author

oxalica commented Jul 20, 2022

The name of NODE_KEY is weird to me. It's called attrpath in Nix and in nixpkgs.
Should we rename it?

+1 to renaming

I'd prefer to keep this PR focused on parsing fixes, and leave the renaming to another PR later. Maybe after the refactoring of high-level AST in #105.

Copy link
Contributor

@darichey darichey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last thing from me: Theimpl TryFrom<SyntaxNode> for ParsedType should be updated to account for ParsedType::HasAttr as well.

src/types.rs Outdated Show resolved Hide resolved
src/parser.rs Show resolved Hide resolved
@oberblastmeister
Copy link
Contributor

@oxalica This looks good. I'll merge after it's rebased

Current handling of Attrpath and Dynamic `${` is mostly incorrect.

- The dynamic attribute `${` can only appear as an Attr inside an Attrpath,
  and itself cannot make up an expression.

  But currently we parse it as a kind of Expr.

  This commit reject this usage.

- Dot separated attributes, aka. Attrpath or Key, can appear in 3 places:

  1. After a "." in a Select expression
     `<expression>.<attrpath>`

     https://github.com/NixOS/nix/blob/2.10.3/src/libexpr/parser.y#L439

  2. After a "?" in a HasAttr expression
     `<expression> ? <attrpath>`

     https://github.com/NixOS/nix/blob/2.10.3/src/libexpr/parser.y#L416

  3. As a binding key in Let or Attrset-like expression
     `<attrpath> = <expression>;`

     https://github.com/NixOS/nix/blob/2.10.3/src/libexpr/parser.y#L541

  To make it consistent, all usage above are now wrapped in a
  NODE_ATTRPATH,
  which consists of multiple {NODE_DYNAMIC,NODE_IDENT,NODE_STRING}
  interspersed by TOKEN_DOT.

  NODE_KEY_VALUE is also renamed to NODE_ATTRPATH_VALUE.

  Benefiting from flat Attrpath instead of nested one, we can now get
  rid of NODE_OR_DEFAULT, since it can be trivially distinguished by
  checking if there is an `or` in NODE_ATTRPATH_VALUE.
@oberblastmeister
Copy link
Contributor

This or business is incredibly sus (in the nix language).

@oberblastmeister
Copy link
Contributor

It looks like we are parsing or wrong. In the nix parser or is only an identifier when it is used in an apply expression. In our parser or becomes an identifier when it is not used directly after an attrpath in a select expression which is too liberal.

@oberblastmeister
Copy link
Contributor

  1. This pr already does lots of good improvements, so the issue might not be in the scope of this pr.
  2. It might be good to be more liberal, and do a separate pass to warn on using or incorrectly. This would be better for error handling because one incorrect or won't trigger a bunch of parse errors (because of recovery), and will only trigger one precise dedicated error.

@oxalica
Copy link
Contributor Author

oxalica commented Jul 23, 2022

It looks like we are parsing or wrong.

Yes, it's still not perfect but maybe it's kind of off-topic for this PR.

This or business is incredibly sus (in the nix language).
In the nix parser or is only an identifier when it is used in an apply expression.

No "only". It's also valid in attrpath let or = 1; in .... Not any "apply expression", it's only valid for the direct RHS of Apply.

  • or 1 -> no parse
  • 1 or -> ok

And the "fallback to ident" takes precedence over left-associative of Apply. 🤯

  • fold or false -> (fold or) false
  • fold false or -> foldr (false or)
  • fold or or -> no parse

@oberblastmeister
Copy link
Contributor

No "only". It's also valid in attrpath let or = 1; in ....

I was only highlighting the differences, we obviously already parse this correctly

No "only". It's also valid in attrpath let or = 1; in .... Not any "apply expression", it's only valid for the direct RHS of Apply.

Thanks bro

And the "fallback to ident" takes precedence over left-associative of Apply.

Yeah unfortunate. That's what happens when you put call inside of inside of expr_select, too high precedence!

@@ -7,6 +7,7 @@ macro_rules! T {
(in) => ($crate::SyntaxKind::TOKEN_IN);
(inherit) => ($crate::SyntaxKind::TOKEN_INHERIT);
(let) => ($crate::SyntaxKind::TOKEN_LET);
(or) => ($crate::SyntaxKind::TOKEN_OR);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this here?

@oberblastmeister oberblastmeister merged commit a22982c into nix-community:master Jul 23, 2022
oberblastmeister pushed a commit to oberblastmeister/rnix-parser that referenced this pull request Jul 23, 2022
@oberblastmeister
Copy link
Contributor

Either or needs to be added as a new token in the lexer, or a special case is needed. @oxalica can you do this? see #115

@oxalica
Copy link
Contributor Author

oxalica commented Jul 24, 2022

@oberblastmeister

Either or needs to be added as a new token in the lexer, or a special case is needed.

I created #116 for the fix.

BTW: You are able to push to my PR branch before merging it, if you have write access to the repo, I think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Doesn't parse valid Nix edge-cases
4 participants