Function for transforming store path contents #264541

infinisil opened this issue Oct 31, 2023 · 10 comments
Labels: 0.kind: enhancement (Add something new), 6.topic: lib (The Nixpkgs function library)

Comments

@infinisil
Member

Issue description

I've been working on the new file set library in recent months, which allows selecting local eval-time paths to add to the store. Notably this library does not support handling of store paths, see #264537 for why not.

However, it would be very doable to have a simple function for doing arbitrary transformations over store paths with a command. I can imagine an interface like this:

pkgs.transformStorePath {
  path = pkgs.hello;
  command = ''
    rm -rf share
    mv $out/bin/{hello,hallo}
  '';
}

I can also imagine alternative, more convenient approaches; suggestions welcome.

Ping @roberth @fricklerhandwerk

This issue is sponsored by Antithesis

@fricklerhandwerk
Contributor

It seems enough to document that one can arbitrarily transform store paths at build time. We already have plenty of shell scripting helpers, no need to add another interface for general computation.

@infinisil
Member Author

@fricklerhandwerk It doesn't have to be this exact interface, and maybe if we have the right use cases, there's a case to be made for a composable API to transform store paths too.

However I think even this very simple command-based API would be worth having, because it's currently not easily doable with other functions:

  • pkgs.srcOnly { src = pkgs.hello; } doesn't quite work, because it doesn't support running a command. This could be supported by fixing its installPhase though, such that
    pkgs.srcOnly {
      src = pkgs.hello;
      postInstall = "...";
    }
    works. Still, srcOnly is weird in that it supports arbitrary attributes and passing those to mkDerivation. I'd rather deprecate it than make it be the default for store path transformations.
  • The following doesn't work nicely either, because it doesn't run any unpack phase on the source, so you'd have to manually cp and chmod the source:
    pkgs.runCommand "..." {
      src = pkgs.hello;
    } ''
      ...
    ''
    This could also be fixed by making it run the standard unpack phase, such that the above works.

In comparison, having a new well-scoped function for just transforming a store path might be a really nice primitive to have.
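For reference, a minimal sketch of what such a function could look like on top of pkgs.runCommand, doing the manual cp and chmod mentioned above. The names transformStorePath, name and storePath are hypothetical and not part of Nixpkgs, and running the command with $out as the working directory is an assumption taken from the example in the issue description.

```nix
{ pkgs ? import <nixpkgs> { } }:

let
  # Hypothetical helper, not part of Nixpkgs.
  transformStorePath =
    { path, command, name ? "transformed-store-path" }:
    pkgs.runCommand name { storePath = path; } ''
      # Store paths are read-only, so copy the contents and make them writable.
      cp -r "$storePath" "$out"
      chmod -R u+w "$out"

      # Assumption: the command runs with $out as its working directory.
      cd "$out"
      ${command}
    '';
in
transformStorePath {
  path = pkgs.hello;
  command = ''
    rm -rf share
    mv $out/bin/{hello,hallo}
  '';
}
```

Unlike srcOnly, this sketch accepts only the three arguments shown, which keeps it scoped to transforming one store path.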

Of course, if we have any use cases that even need this, so far I don't know of any.

@bew
Contributor

bew commented Nov 9, 2023

For inspiration/discussion maybe:

I have a use case and a specialized implementation in my dotfiles of a content transformation like this, to replace the binaries in a pkg with my own:
https://github.com/bew/dotfiles/blob/0fa57358fa5a67f1/nix/homes/mylib/mybuilders.nix#L133-L205
And I'm using it in 4 places at the moment.

Another use case I had was to patch incoming sources (didn't find a better way than a manual drv):
https://github.com/bew/keyboard-qmk-moonlander/blob/ab7b1363ecb6ccd81/flake.nix#L30-L46
Later in that file I'm also adding my own code in an existing source pkg via a postUnpack hook (not sure if it's the best way to do this):
https://github.com/bew/keyboard-qmk-moonlander/blob/ab7b1363ecb6ccd81/flake.nix#L107-L113
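For the source-patching case specifically, the existing pkgs.applyPatches helper may already be enough. A minimal sketch, assuming a patch file ./my-fix.patch (hypothetical) sits next to the expression, and using pkgs.hello.src purely as a stand-in source:

```nix
{ pkgs ? import <nixpkgs> { } }:

pkgs.applyPatches {
  name = "hello-src-patched";
  # Stand-in source; any fetched source should work the same way.
  src = pkgs.hello.src;
  # Hypothetical patch file next to this expression.
  patches = [ ./my-fix.patch ];
}
```

This only covers patches, not arbitrary commands, so it does not replace the function proposed in this issue.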


> Of course, if we have any use cases that even need this, so far I don't know of any.

Could the (many) wrapper packages we have be a use case for this? There, the 'unwrapped' pkg is transformed to have some kind of new behavior.

@roberth
Member

roberth commented Nov 28, 2023

Ideally we'd be able to produce the same store path through this. This would then allow cache sharing when NixOS/nix#9259 is implemented.

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-11-28-nixpkgs-architecture-team-meeting-46/36171/1

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/easy-source-filtering-with-file-sets/29117/15

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/is-it-possible-to-create-a-fileset-from-a-derivation-output-path/42194/2

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/is-there-a-function-or-helper-for-patching-derivations-with-scripts/42809/2

@tomodachi94 added the labels "0.kind: enhancement" and "6.topic: lib" on May 12, 2024
@roberth
Member

roberth commented Jul 7, 2024

Proposal: lib.fileset.serializable, a smaller dialect that can be executed by derivations

The crucial limitation of derivations is that we can't serialize Nix functions into them,¹ so I think it makes sense to name the sublibrary after this property.

This sublibrary would consist primarily of lazy, mostly dumb constructors, that construct a data structure without performing any computation.
The only behavior that occurs in union, intersect, etc, is the automatic conversion from path values to filesets.
pathExists is not called.

After constructing the serializable fileset, the only two operations that consume it are:

  • a conversion function to regular filesets
  • lib.fileset.serializable.toJSON { root, fileset }

toJSON maps over all the path values stored in fileset to compute relative paths from root to each path. This operation produces an equivalent data structure, except the location in path value space is lost; exactly what we need.

Example output

{ "unions": [
  { "path": "./src/foo.cc" },
  { "byExt": {
    "root": { "path": "./include" },
    "ext": ".hh"
  }}
]}

Note that we can't have a filter function, so we'll need special purpose combinators.
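A rough sketch of how the constructors and toJSON could fit together, under the assumptions above. The names serializable, coerce and relativize and the exact node shapes are hypothetical, chosen only to match the example output; lib.path.removePrefix is the existing library function used to compute the relative paths.

```nix
let
  lib = (import <nixpkgs> { }).lib;

  # Path values are converted to fileset nodes automatically; nothing else is computed.
  coerce = x: if builtins.isPath x then { path = x; } else x;

  # Hypothetical constructors: they only build data, pathExists is never called.
  serializable = {
    unions = filesets: { unions = map coerce filesets; };
    byExt = root: ext: { byExt = { root = coerce root; inherit ext; }; };

    # Replace every path value with a string relative to `root`, then serialize.
    toJSON = { root, fileset }:
      builtins.toJSON (relativize root (coerce fileset));
  };

  # Walk the data structure and turn each path value into a relative path string.
  relativize = root: node:
    if node ? path then
      { path = lib.path.removePrefix root node.path; }
    else if node ? unions then
      { unions = map (relativize root) node.unions; }
    else if node ? byExt then
      { byExt = { root = relativize root node.byExt.root; inherit (node.byExt) ext; }; }
    else
      throw "unknown serializable fileset node";
in
serializable.toJSON {
  root = ./.;
  fileset = serializable.unions [
    ./src/foo.cc
    (serializable.byExt ./include ".hh")
  ];
}
```

The path values have to be relativized before builtins.toJSON sees them, because serializing a path value directly would copy that path into the store.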

In Nixpkgs we'll implement a program that can perform a filtering copy using the JSON as its input, and a simple runCommand-flavored derivation helper.

Details

  • Maybe this should be pkgs.filesets instead of lib.fileset.serializable?
  • Maybe byExt is too specific. Make the interface itself more powerful with e.g. regex matching? byExt could be implemented in pure Nix by translating its specific behavior to a regex.
  • The source filtering derivation really wants to be a derivation with a content-addressed output, so that it benefits from early cutoff. When a change happens anywhere in the fileset root, all source filters must rerun, but most of them produce the same output. Making that output content-addressed means that it is as effective at preventing rebuilds as evaluation-time fileset evaluation. (Not counting the cheap filtering drvs of course)
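As an illustration of the last point, a filtering derivation with a floating content-addressed output could be declared roughly like this. The cp command is only a stand-in for the proposed filtering program, and the src argument and the "filtered-source" name are placeholders; the three output attributes are the existing way to request a content-addressed output and require the experimental ca-derivations feature.

```nix
{ pkgs ? import <nixpkgs> { }, src ? ./. }:

pkgs.runCommand "filtered-source" {
  inherit src;
  # Floating content-addressed output; needs the experimental
  # `ca-derivations` feature enabled in Nix.
  __contentAddressed = true;
  outputHashMode = "recursive";
  outputHashAlgo = "sha256";
} ''
  # Stand-in for the proposed filtering program: keep only ./src.
  mkdir -p "$out"
  cp -r "$src/src" "$out/src"
''
```

With this, a change outside ./src still reruns the cheap copy, but the resulting output path stays the same, so dependent derivations need not rebuild.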

Resolving path values works well

  • "Indirections for packaging meson-based granular build for Nixpkgs" (nix#11055) uses it, but only for the root parameter, ignoring the actual fileset. This proposal would allow the fileset to be serialized too.
    • Basically a cardboard MVP that is already useful. It is not hermetic and doesn't prevent rebuilds, but it is interface-compatible:
      root = ./.;
      src = fetchFromGitHub foo;
      resolveRelPath = p: lib.path.removePrefix root p;
      resolvePath = p: src + "/${resolveRelPath p}";
      filesetToSource = { root, fileset }: resolvePath root;
      # arrange package.nix files to accept filesetToSource as argument
      # it makes them vendorable into Nixpkgs despite their fileset

¹ It's not fundamentally impossible, but very complex to implement, and very likely to make your drv hashes inadvertently depend on evaluator implementation details.

@ursi
Contributor

ursi commented Dec 1, 2024

> if you have a practical use case let me know in #264541
I just tried to add a purescript module from a git repository. Unfortunately it had decided to include a module whose name collided with a module in my dependencies already, so it wasn't working. I used builtins.path to filter that file out of the source, in order to use the package.
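A sketch of that kind of filtering with builtins.path, assuming src is the fetched repository and that the colliding module lives in a file called Colliding.purs (a hypothetical name):

```nix
# `src` is assumed to be the fetched repository.
src:

builtins.path {
  path = src;
  name = "source";
  # Hypothetical file name; drop the module whose name collides.
  filter = path: type: baseNameOf path != "Colliding.purs";
}
```

The filter runs at evaluation time and receives each absolute path as a string plus its file type, returning false for entries to leave out.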
