Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib.getOptionalAttrs: init #269586

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

ShamrockLee
Copy link
Contributor

@ShamrockLee ShamrockLee commented Nov 24, 2023

Description of changes

This PR introduces lib.getOptionalAttrs, a handy function that returns a sub- attribute according to the input list of attribute names. Unlike lib.getAttrs, it skips attributes that are not in the input attribute set instead of throwing the error of missing attributes. It acts as the complement of removeAttrs.

Changes during PR review:

  • Rename getExistingAttrs -> getOptionalAttrs.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 23.11 Release Notes (or backporting 23.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Priorities

Add a 👍 reaction to pull requests you find important.

@github-actions github-actions bot added the 6.topic: lib The Nixpkgs function library label Nov 24, 2023
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Nov 24, 2023
lib/attrsets.nix Outdated Show resolved Hide resolved
@ShamrockLee ShamrockLee force-pushed the lib-get-existing-attrs branch from 5e273b8 to f7f1c5d Compare November 24, 2023 18:02
@ShamrockLee ShamrockLee requested a review from Kiskae November 24, 2023 18:04
@infinisil
Copy link
Member

What are the use cases?

And in any case, needs tests

@ShamrockLee
Copy link
Contributor Author

What are the use cases?

It's used inside the definition of lib.extendDerivation' introduced in #269785. I defined it locally there to avoid unnecessary coupling between PRs.

I also find it useful when trying to implement the support for multiple VSCode extension registries (#121583).

And in any case, needs tests

Where could I add a test case for Nixpkgs lib functions?

@Kiskae
Copy link
Contributor

Kiskae commented Nov 26, 2023

Where could I add a test case for Nixpkgs lib functions?

https://github.com/NixOS/nixpkgs/blob/master/lib/tests/misc.nix

@ShamrockLee ShamrockLee force-pushed the lib-get-existing-attrs branch from f7f1c5d to c0c124a Compare November 26, 2023 12:50
@ShamrockLee
Copy link
Contributor Author

And in any case, needs tests

Tests added.

@ShamrockLee
Copy link
Contributor Author

The ofborg bot seems broken.

lib/attrsets.nix Outdated Show resolved Hide resolved
@ShamrockLee ShamrockLee force-pushed the lib-get-existing-attrs branch from c0c124a to dda3c4b Compare November 27, 2023 09:54
@ShamrockLee ShamrockLee requested a review from Kiskae November 27, 2023 09:54
@roberth
Copy link
Member

roberth commented Nov 27, 2023

This is a generalized intersectAttrs, so I would prefer to call it by that name.
See NixOS/nix#9050

@infinisil
Copy link
Member

I don't think we need to try to reuse names of builtins, which due to their nature cannot ever be renamed and as such sometimes have kind of bad names.

In particular intersectAttrs feels like it should be a symmetric operation, something like (a -> a -> a) -> Attrs -> Attrs -> Attrs, where the (a -> a -> a) is called when there's a shared attribute.

A list-based function is even less symmetrical, so I think this name fits even less.

Here's some alternate ideas:

  • getAttrNames
  • selectAttrs
  • selectAttrNames
  • filterAttrsByNames
  • limitAttrs
  • getExistingAttrs (current PR)
  • intersectAttrs (@roberth's proposal)

@ShamrockLee ShamrockLee force-pushed the lib-get-existing-attrs branch 2 times, most recently from 8ab7d67 to 8e6cf7a Compare November 27, 2023 18:45
@ShamrockLee ShamrockLee changed the title lib.getExistingAttrs: init lib.intersectAttrs: init Nov 27, 2023
@ShamrockLee ShamrockLee changed the title lib.intersectAttrs: init lib.getExistingAttrs: init Nov 27, 2023
@ShamrockLee ShamrockLee force-pushed the lib-get-existing-attrs branch from 8e6cf7a to 7b8e93a Compare November 27, 2023 18:55
@ShamrockLee
Copy link
Contributor Author

ShamrockLee commented Nov 27, 2023

In particular intersectAttrs feels like it should be a symmetric operation, something like (a -> a -> a) -> Attrs -> Attrs -> Attrs, where the (a -> a -> a) is called when there's a shared attribute.

I agree that the word intersect should imply something symmetric. lib.intersectLists does prefer the second one over the first one by inherits its order of elements, but it is at least the mathematical set intersection of the elements from both lists.

Here's some alternate ideas:
* getAttrNames
* selectAttrs
* selectAttrNames
* filterAttrsByNames
* limitAttrs
* getExistingAttrs (current PR)
* intersectAttrs (@roberth's proposal)

Regarding these candidates, I like selectAttrs and getExistingAttrs better.

  • limitAttrs doesn't contain information about how the limitation is applied.
  • selectAttrNames sounds like the output would be a list of names like builtins.attrNames
  • filterAttrsByNames seems good at the first sight, but it could also be interpreted as a simpler version of filterAttrs that takes (name: decision) instead of (n: v: decision).

Update:
It seems that filterAttrsByNames is okay, as long as it is not filterAttrsByName.

@ShamrockLee
Copy link
Contributor Author

BTW, I have generalized the implementation to also take an attribute set aside from a list of strings, which meets the expectation of NixOS/nix#9050.

@roberth
Copy link
Member

roberth commented Nov 27, 2023

I don't think we need to try to reuse names of builtins,

I we should lean into this: lib as a better replacement for builtins with more intuitive behavior (think foldl') and backcompat with older Nix versions, but no massive changes like having an incompatible signature.

cannot ever be renamed and as such sometimes have kind of bad names.

They are established names, and established names are hard to erase, so I see improving their behavior as a good thing.

I don't see having multiple functions doing the same thing as a good situation to have.
As pointed out in the issue, Nix itself is currently inconsistent about those two functions, but can be improved without breaking changes. lib only needs to polyfill the new behavior.

In particular intersectAttrs feels like it should be a symmetric operation, something like (a -> a -> a) -> Attrs -> Attrs -> Attrs, where the (a -> a -> a) is called when there's a shared attribute.

This would have been good when the builtin was first added, but that could now be intersectAttrsWith instead.

@infinisil
Copy link
Member

Please check out NixOS/nix#9050 (comment), where I proposed this:

  • lib.attrsets.removeKeys :: List -> Attrs -> Attrs
  • lib.attrsets.selectKeys :: List -> Attrs -> Attrs

Would be very consistent and not conflict with any builtins names.

@wegank wegank added the 2.status: merge conflict This PR has merge conflicts with the target branch label Mar 20, 2024
@ShamrockLee
Copy link
Contributor Author

I'll revert back to getExistingAttrs :: [ String ] -> AttrSet -> AttrSet then.

We could extend it later if we like to.

@ShamrockLee ShamrockLee force-pushed the lib-get-existing-attrs branch from 7b8e93a to 18680a0 Compare March 24, 2024 13:04
@ShamrockLee ShamrockLee force-pushed the lib-get-existing-attrs branch from 18680a0 to b6e8f56 Compare March 24, 2024 13:33
@ShamrockLee
Copy link
Contributor Author

Bump!

@ofborg ofborg bot added 10.rebuild-darwin: 1-10 10.rebuild-darwin: 1 and removed 2.status: merge conflict This PR has merge conflicts with the target branch 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin labels Mar 24, 2024
# A list of attribute names to get out of `attrs`.
names:
# The set to get the named attributes from.
attrs: getAttrs (intersectLists (attrNames attrs) names) attrs;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe an implementation using intersectAttrs would be more efficient.

@roberth
Copy link
Member

roberth commented Mar 26, 2024

How about getOptionalAttrs?
It's close to both getAttrs and optionalAttrs, both of which are relevant, and quite consistent.
It also avoids exists, which hasn't been used in a context like this in the library, and reminds me of first-order logic, which is not relevant here.

@infinisil
Copy link
Member

infinisil commented Mar 26, 2024

Re #269586 (comment):

I also find it useful when trying to implement the support for multiple VSCode extension registries (#121583).

(getExistingAttrs [ "description" "homepage" ] ref)

There you can just do this instead:

{ inherit (ref) description homepage"; }

It's used inside the definition of lib.extendDerivation' introduced in #269785

) (shiftOverriders outputName overridersFromMain // getExistingAttrs overriderNames outputDrv)
else shiftOverriders outputName (getExistingAttrs overriderNames passthru);

That's a better use case.


I guess the problem is really that we have two ways to represent a set of strings. On one hand there's List String ([ "a" "b" "c" ]), which has the problem that it doesn't allow efficient operations.

On the other hand there's Map String Any ({ a = null; b = null; c = null; }) which has the problem that it has unnecessary values, so it's annoying to construct.

Here's an idea for an alternative: Let's have a function that allows easily constructing an attribute set from a list of strings:

nullAttrs = attrs: genAttrs attrs (_: null)

This way

getExistingAttrs names attrs
==
intersectAttrs (nullAttrs names) attrs

This also makes it clear that if you have multiple such calls, the nullAttrs call can be shared between them, whereas with getExistingAttrs it can't.

What do you think?

@ShamrockLee
Copy link
Contributor Author

ShamrockLee commented Mar 26, 2024

Thank you for spending time reading the hyperlinked PRs.

Re #269586 (comment):

I also find it useful when trying to implement the support for multiple VSCode extension registries (#121583).

(getExistingAttrs [ "description" "homepage" ] ref)

There you can just do this instead:

{ inherit (ref) description homepage"; }

You're right. I cannot recall why I did it that way back then.

It's used inside the definition of lib.extendDerivation' introduced in #269785

) (shiftOverriders outputName overridersFromMain // getExistingAttrs overriderNames outputDrv)
else shiftOverriders outputName (getExistingAttrs overriderNames passthru);

That's a better use case.

Thank you.

I guess the problem is really that we have two ways to represent a set of strings. On one hand there's List String ([ "a" "b" "c" ]), which has the problem that it doesn't allow efficient operations.

On the other hand there's Map String Any ({ a = null; b = null; c = null; }) which has the problem that it has unnecessary values, so it's annoying to construct.

Here's an idea for an alternative: Let's have a function that allows easily constructing an attribute set from a list of strings:

nullAttrs = attrs: genAttrs attrs (_: null)

This way

getExistingAttrs names attrs
==
intersectAttrs (nullAttrs names) attrs

This also makes it clear that if you have multiple such calls, the nullAttrs call can be shared between them, whereas with getExistingAttrs it can't.

I'm haven't dug into the C++ implementation of Nix Language lists and attribute sets yet. So the point is, lib.intersectLists is slow (operating two linked lists with $O(n \log n)$ or even $O(n^2)$ time complexity), and builtins.intersectAttrs is so fast that constructing an attribute set of nulls is worth it. Is that right?

If the recommended way to implement is,

names: intersectAttrs (nullAttrs name)

it would be better to interpolate builtins.intersectAttrs into lib as lib.attrsets.intersectAttrs.

@infinisil
Copy link
Member

it would be better to interpolate builtins.intersectAttrs into lib as lib.attrsets.intersectAttrs.

I'm all for that, I think all lib-like builtins should be exposed like that!

@roberth
Copy link
Member

roberth commented Mar 27, 2024

Actually, nullAttrs seems quite expensive, allocating another list and n nameValuePair attrsets. I think I wrongly assumed genAttrs was a builtin.

Unfortunately, the builtins don't offer many ways to go from lists to attrsets.

Here's another possible implementation that uses single-attribute attrsets instead of nameValuePairs, but also n singleton lists. Those are represented quite efficiently, so although this is an odd solution, it might be competitive.

  getExistingAttrs = l: p:
    lib.zipAttrsWith
      (name: values: head values)
      (map
        (n: if p?${n} then { ${n} = p.${n}; } else { })
        l);

If we really want performance, we should probably do NixOS/nix#9050 or make genAttrs into a builtin.
That function is hardly used within the library (or the module system therefore), so I'm not sure that we get a real-world gain from that; not until a use case like this becomes more prevalent in actual hot code.

@ShamrockLee ShamrockLee force-pushed the lib-get-existing-attrs branch from b6e8f56 to 37b853d Compare March 27, 2024 03:13
@ShamrockLee ShamrockLee changed the title lib.getExistingAttrs: init lib.getOptionalAttrs: init Mar 27, 2024
@ShamrockLee
Copy link
Contributor Author

How about getOptionalAttrs? It's close to both getAttrs and optionalAttrs, both of which are relevant, and quite consistent. It also avoids exists, which hasn't been used in a context like this in the library, and reminds me of first-order logic, which is not relevant here.

Now it's called getOptionalAttrs.

@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin and removed 10.rebuild-darwin: 1 10.rebuild-darwin: 1-10 labels Mar 27, 2024
@infinisil
Copy link
Member

Actually, nullAttrs seems quite expensive, allocating another list and n nameValuePair attrsets. I think I wrongly assumed genAttrs was a builtin.

We can optimise that later if necessary. In turn, an interface like getOptionalAttrs <list> <attrs> can fundamentally not be optimised further.

I'm really not convinced by the overall motivation for this function and would like to see nullAttrs instead as a dedicated way to turn a list into an attribute set.

@wegank wegank added the 2.status: merge conflict This PR has merge conflicts with the target branch label Jul 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.status: merge conflict This PR has merge conflicts with the target branch 6.topic: lib The Nixpkgs function library 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants