Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy set patterns (argument unpacking) #9159

Open
RuRo opened this issue Oct 15, 2023 · 7 comments
Open

Lazy set patterns (argument unpacking) #9159

RuRo opened this issue Oct 15, 2023 · 7 comments
Labels
feature Feature request or proposal

Comments

@RuRo
Copy link

RuRo commented Oct 15, 2023

Consider the following example:

let
  overlay1 = final: let inherit (final) a b; in { c = a + b; };
  overlay2 = final @ { a, b, ... }:             { c = a + b; };

  overlay = overlay1;
  # overlay = overlay2;

  self = (
    { a = 1; } //
    (overlay self) //
    { b = 2; }
  );
in
  self

Currently, overlay1 works as expected (producing { a = 1; b = 2; c = 3; }), but overlay2 fails with an infinite recursion error despite being semantically equivalent (I think?).

error:
       … while evaluating the file 'example.nix':

       … in the right operand of the update (//) operator

         at example.nix:8:16:

            7|   self = (
            8|     { a = 1; } //
             |                ^
            9|     (overlay self) //

       … in the left operand of the update (//) operator

         at example.nix:9:20:

            8|     { a = 1; } //
            9|     (overlay self) //
             |                    ^
           10|     { b = 2; }

       … from call site

         at example.nix:9:6:

            8|     { a = 1; } //
            9|     (overlay self) //
             |      ^
           10|     { b = 2; }

       error: infinite recursion encountered

       at «none»:0: (source not available)

It would be nice if overlay2 just worked as well.

Additional context

Tangentially related to #4090.

Priorities

Add 👍 to issues you find important.

@RuRo RuRo added the feature Feature request or proposal label Oct 15, 2023
@infinisil
Copy link
Member

Yeah this is somewhat unexpected, when you destructure an attribute set argument, it does so before evaluating the function body. @roberth has a PR to help with this in Nixpkgs: NixOS/nixpkgs#194514

@RuRo
Copy link
Author

RuRo commented Oct 16, 2023

I really feel that the relaxed behavior I am proposing should be implemented in Nix rather than in Nixpkgs.

Notice, that my (fairly short) example doesn't use lib, but only plain nix. The lazier Nix is out of the box - the less "false" infinite recursion errors there will be for our users to stumble into. We should avoid offloading language "band aids" onto nixkpgs lib.

Also, I would like to point out that there are some quite significant differences between what I am proposing and what lib.lazyFunction is supposed to do. For example, lazyFunction makes the function non-strict with regards to "extra attributes" and it doesn't work with ... and @args.

My proposal on the other hand doesn't change the "extra attribute" strictness and as a consequence it ONLY works with argument expansions that have a ... in them. Argument unpacking patterns WITHOUT a ... would still have to completely eagerly evaluate all of their attribute names to maintain the strictness.

@roberth
Copy link
Member

roberth commented Oct 19, 2023

Changing the language is risky. Strictness can sometimes be better, as it tends to have shorter traces that are more relevant.
That said, not having an error at all is even better of course, so I agree that making attrset functions strict is not great.

It may be possible to make { a }: f a behave as x: f (g x) where g checks that the attrNames are only ["a"] and f is lazy. It would complicate the evaluator a bit, which might lead to worse performance and/or fewer opportunities for optimization.

If #4090 were to be implemented, such a function could still be considered too strict, for evaluating the whole attrset instead of only the attr it needs. In other words, by making attrset laziness fine grained, it requires any checks to be strict.

It's not a coincidence that the words "strict" is applied both to things that check, and to things that are not lazy.

@roberth
Copy link
Member

roberth commented Oct 19, 2023

One possible way to preserve the ability to write strict attrset functions, is to add a feature.
For instance, we could borrow ~ from Haskell's lazy pattern matching. We'd have to figure out where to add it in the syntax.

Example

nix repl> f = ~{ a, b, c }: a+b
nix repl> :p f { a = 1; b = 2; }
3
# c was never checked for because the attrset match or "pattern" was lazy

It may also be possible to add ~ to individual attribute names, if we implement something like g from the previous comment.

I don't like adding to the language, but at least by having the feature we can both shine a light on the issue of strictness and provide a convenient solution.

@RuRo
Copy link
Author

RuRo commented Oct 19, 2023

It seems that my initial proposal wasn't super clear, so let me try to clarify my point.

I am proposing to change the current behaviour of the Nix language ONLY for argument unpacking with ... already present. That is, my proposed lazy behaviour would only affect { a, b, ... }: a + b and { a, ... }@args: a + args.other, but NOT { a, b }: a + b.

Argument unpacking with ... is currently already non-strict in NIX:

let
  fun = { a, b, ... }: a + b;
  oops = {
    a = 1;
    b = 2;
    c = builtins.throw "never evaluated";
  };
in
  fun oops # returns 3

I am simply proposing to extend this non-strictness to also affect attribute names (not just the values).
With my proposed semantics, {a, b}: a + b remains strict and non-lazy, because you have to evaluate all the names in order to check that no more attributes are present in the attrset (except a and b).

What I am proposing is similar to #4090, but much smaller in scope. I believe, that my proposal should be fairly easy to implement, because unlike #4090, it doesn't require completely changing the semantics of the language in regard to attrset keys.

In somewhat simplified pseudocode:

  • when evaluating an expression of the form { <UNPACK> }@attrs: <FUNCTION-BODY>
  • check if <UNPACK> has an ellipsis
    • if it does, replace argument unpacking with whatever inherit (attrs) <UNPACK> would do
    • if it doesn't, don't change anything, retain the current strict unpacking semantics

Edit: okay, so it's not quite "whatever inherit (attrs) <UNPACK> would do". As it turns out, inherit (attrs) missing; doesn't check if missing is present in attrs if missing isn't actually used anywhere in the body. So the proposed behaviour would have to insert a builtins.seq or something. So that this still fails even with the proposed lazy semantics

let
  fun = { a, b, c, ... }: a + b;
  oops = { a = 1; b = 2; d = "is not c"; };
in
  fun oops # should still fail with "function 'fun' called without required argument 'c'"

@roberth
Copy link
Member

roberth commented Oct 20, 2023

Yeah, it's subtle.

c = builtins.throw "never evaluated";

We only need to be concerned with the names. Attribute values are consistently non-strict.

Argument unpacking with ... is currently already non-strict

This just isn't the case.

in NIX

Nitpick: Nix.


I feel like the idea of it being lazy but still checked on access just blurs the line too much for something that has a performance cost as well. To make it behave like f (check x), we'll need an extra thunk to represent the work of checking the args (avoiding O(nArgs²) checking work). What if we have args@ as well, and args gets passed to another function? We'll have a buildup of such thunks in memory until the innermost function returns or triggers all the checks. (a "space leak")
Unchecked lazy functions that behave like args: let inherit (args) a1 a2; in (ie a1 = args.a1 etc) don't suffer from this performance problem and seem easier to understand.

@RuRo
Copy link
Author

RuRo commented Oct 20, 2023

Yeah, it's subtle.

c = builtins.throw "never evaluated";

We only need to be concerned with the names. Attribute values are consistently non-strict.

Argument unpacking with ... is currently already non-strict

This just isn't the case.

Actually, the fact that the value of c is never evaluated is completely besides the point in my example (my bad, I shouldn't have included the red herring of builtins.throw).

My point was that { a, b, ... }@maybeArgs: is non-strict with regards to the presence of attribute names other than a and b. That's the whole point of the ... - to remove the restriction on the presence of extra arguments. Currently, the ... argument names are semantically non-strict (not checked, not used in the computation), but practically non-lazy (aka eager, aka strict).

To me, this seems like an implementation detail. The whole point of ... is to go "yeah, and also some other args maybe, I don't really care". There is really no point in eagerly evaluating all of the attribute names when you explicitly don't care about most of them. The fact that nix does evaluate them feels more like a minor oversight rather than a deliberate design choice.


I feel like the idea of it being lazy but still checked on access just blurs the line too much for something that has a performance cost as well. To make it behave like f (check x), we'll need an extra thunk to represent the work of checking the args (avoiding O(nArgs²) checking work). What if we have args@ as well, and args gets passed to another function? We'll have a buildup of such thunks in memory until the innermost function returns or triggers all the checks. (a "space leak") Unchecked lazy functions that behave like args: let inherit (args) a1 a2; in (ie a1 = args.a1 etc) don't suffer from this performance problem and seem easier to understand.

To be honest, I don't think I quite follow what you are arguing here? Are you talking about the {a,b,c,...}: a+b case where c is accepted as an argument, but not explicitly used? Are you proposing that we shouldn't check for the existence of the attribute c, because that would incur a significant performance/memory overhead?

Honestly, I am kind of +/-0 on this issue. Ideally I would prefer to retain the current semantics, but if it's really impossible to "inline" this check somewhere reasonable, then I am totally fine with using the "plain" inherit (args) <UNPACK>; in semantics even if it can lead to slightly weird results if not all arguments are used.

Actually, unused unpacked arguments could be pretty easily caught with static analysis (whether by linters or by nix itself).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Feature request or proposal
Projects
None yet
Development

No branches or pull requests

3 participants