Copy local flakes to the store lazily #3121

Open
edolstra opened this issue Oct 7, 2019 · 64 comments · May be fixed by #6530
Labels
fetching (Networking with the outside (non-Nix) world, input locking), flakes, significant (Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc.)

Comments

@edolstra
Member

edolstra commented Oct 7, 2019

Currently flakes are evaluated from the Nix store, so when using a local flake, it's first copied to the store. This means that

$ cd /path/to/nixpkgs
$ nix build .#hello

is a lot slower than the non-flake alternative

$ nix build -f . hello

Ideally, we would copy the flake to the store only when its outPath attribute is evaluated. However, we also need to ensure that it's not possible to access untracked files (i.e. we need to check every file against git ls-files).
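The "copy only when outPath is evaluated" idea can be pictured with a small model. This is a minimal sketch, not Nix's implementation: the class name `LazyFlakeSource`, the `-source` suffix, and the use of a plain directory as a stand-in for the store are all assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path


class LazyFlakeSource:
    """Hypothetical model of a local flake whose store copy is deferred."""

    def __init__(self, source_dir: Path, store_dir: Path):
        self.source_dir = source_dir
        self.store_dir = store_dir  # plain directory standing in for /nix/store
        self._out_path = None
        self.copies = 0  # how many times a real copy was performed

    def read(self, relative: str) -> str:
        # Evaluation reads files in place; nothing is copied here.
        return (self.source_dir / relative).read_text()

    @property
    def out_path(self) -> Path:
        # The copy happens on first access only and is cached afterwards.
        if self._out_path is None:
            target = self.store_dir / (self.source_dir.name + "-source")
            shutil.copytree(self.source_dir, target)
            self._out_path = target
            self.copies += 1
        return self._out_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "flake"
        src.mkdir()
        (src / "flake.nix").write_text("{ }")
        store = Path(tmp) / "store"
        store.mkdir()
        flake = LazyFlakeSource(src, store)
        flake.read("flake.nix")   # no copy yet
        print(flake.copies)
        flake.out_path            # first access triggers the copy
        print(flake.copies)
```

A flake that never touches `out_path` never pays for the copy in this model; the tracked-files restriction mentioned above is a separate concern that this sketch leaves out.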

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/flakes-without-git-copies-entire-tree-to-nix-store/10743/2

@hmenke
Member

hmenke commented Jun 28, 2021

Still important.

@stale stale bot removed the stale label Jun 28, 2021
@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/my-painpoints-with-flakes/9750/20

@edolstra edolstra modified the milestones: flakes-v3, nix-3.0 Aug 30, 2021
@L-as
Member

L-as commented Sep 3, 2021

It would be nice if Nix could take advantage of the filesystem's native CoW functionality (if present) in order to speed up copying.
We discussed this briefly in #offtopic:nixos.org.

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/is-it-possible-to-make-a-flake-that-has-no-source-tree/16037/2

@lilyball

This comment was marked as duplicate.

@lilyball
Member

Also for context, in my case the flake was not in a git repo, it was just in a folder. Copying to the nix store is unacceptable because the folder contains multiple git repos along with all their build artifacts. Copying a git repo to the Nix store at least would avoid copying untracked files, but in my case it had hundreds of thousands of files and multiple gigabytes of data to look at and copy.

@Atemu
Member

Atemu commented Nov 13, 2021

@lilyball @L-as Taking advantage of CoW doesn't work on Linux either due to a VFS limitation: #5513

One thing I'd like to understand in this issue is why a local flake can't be evaluated "directly" just like the old default.nix-style file evaluation.
I know copying has benefits for hermetic evaluation and such but I don't need that, like, at all.
Sure, remote flakes should be copied to the Nix store and that's really great functionality but I see no point whatsoever in doing the same for local flakes that are already in the FS and not expected to change without the user's knowledge.

@TLATER

TLATER commented Nov 14, 2021

@Atemu from what I've read, it's to help enforce hermetic evaluation and avoid impurities. Presumably it also has advantages for code simplicity, because you don't need to write something separate for local flakes.

I agree it's not great UX for those of us who use flakes just to keep track of a dev shell, of course :)

@Atemu

This comment was marked as duplicate.

@TLATER

TLATER commented Nov 14, 2021

> but I don't see any point in hermetic eval on local files.

You might not realize you're using local files, accidentally sneak in state, and then be surprised when it doesn't evaluate in deployment (and be all "wait, isn't nix supposed to prevent this?"). Even with fully local files, I'd expect things still to work if I move my directory to a new computer from a restored backup. While I've personally learned when and where local state might happen, it's still a safety net that I consider nice to have.

Of course, giant copies for the tiniest delta is way too much of a cost to incur for that, but this is why we're here - to make sure that flakes don't blow up SSDs all over the place when they finally become non-experimental ;)

@Atemu
Member

Atemu commented Nov 14, 2021

> You might not realize you're using local files, accidentally sneak in state, and then be surprised when it doesn't evaluate in deployment (and be all "wait, isn't nix supposed to prevent this?").

I don't understand what you mean by that.

How is copying the accidentally added state over to the Nix store first and then evaling it any better than just evaling it directly?

> Even with fully local files, I'd expect things still to work if I move my directory to a new computer from a restored backup. While I've personally learned when and where local state might happen, it's still a safety net that I consider nice to have.

How is the location of the directory related to any of this? A direct eval of the same state of a directory in another location will have the same result. How should copying improve anything?

@andir
Member

andir commented Nov 14, 2021

IIRC files that are tracked with git already (and changed) are being staged and then copied to the store. I can see how this ensures that at least the files are tracked and marked as updated (by staging them). I also kind of agree that I think this is the wrong solution to the problem, or perhaps a solution in search of a problem? Most of the time it is very expensive to copy my working directory into the store.

Since I can see why that feature is useful, I'd argue that it should be configurable if you want your flake repos to be copied to the store or not. As far as I know, the hash of the path that is added to the store is also currently used for the eval caching.

Perhaps the current implementation is a nice PoC of what a more proper hermetic eval could look like and what it gives us in terms of capabilities (caching, ...).

@Atemu
Member

Atemu commented Nov 16, 2021

> I can see how this ensures that at least the files are tracked and marked as updated (by staging them).

That sounds like a sound reason, but I can't see how that wouldn't be possible with direct eval too.

@edolstra could we get some insight from you here?

@Kha
Contributor

Kha commented Nov 16, 2021

It's definitely possible, just more work as described in the initial post:

> However, we also need to ensure that it's not possible to access untracked files (i.e. we need to check every file against git ls-files).

Nix already has an "eval may only access these store paths" logic, but no "may only access tracked files of this Git checkout" logic yet, so using the former was the simplest solution, I assume.
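For illustration, a "may only access tracked files of this Git checkout" check could look roughly like the following. This is a minimal sketch under stated assumptions: `TrackedOnlyReader` and `git_tracked_files` are hypothetical names, nothing here reflects Nix's evaluator, and the only real tool involved is `git ls-files -z`.

```python
import subprocess
import tempfile
from pathlib import Path


def git_tracked_files(repo: Path) -> set[str]:
    """Repo-relative paths of tracked files, as listed by `git ls-files -z`."""
    out = subprocess.run(
        ["git", "-C", str(repo), "ls-files", "-z"],
        check=True,
        capture_output=True,
    ).stdout
    return {name.decode() for name in out.split(b"\0") if name}


class TrackedOnlyReader:
    """Hypothetical gate: reads succeed only for paths in the tracked set."""

    def __init__(self, repo: Path, tracked: set[str]):
        self.repo = repo.resolve()
        self.tracked = tracked

    def read(self, relative: str) -> str:
        path = (self.repo / relative).resolve()
        try:
            rel = path.relative_to(self.repo).as_posix()
        except ValueError:
            # The path escapes the checkout, e.g. "../secret".
            raise PermissionError(f"{relative!r} is outside the checkout")
        if rel not in self.tracked:
            raise PermissionError(f"access to untracked file {rel!r} is forbidden")
        return path.read_text()
```

In a real checkout one would build the reader with `TrackedOnlyReader(repo, git_tracked_files(repo))`; the tracked set is passed in explicitly so the gate can be exercised without Git.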

@Atemu
Member

Atemu commented Nov 16, 2021

Another important point @rnhmjoj mentioned in Discourse is security. A user can easily unknowingly expose private/secret information globally on a system by building a local flake.

@lilyball
Member

Hardlinks wouldn't work, the nix store needs to be read-only immutable files and hardlinks mean permissions are shared and editing one file edits both. If your filesystem supports copy-on-write then that should help, but it won't work if your nix store is on a separate volume (though hardlinks wouldn't work in that case either).
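The point about hardlinks sharing both content and permissions can be checked on any POSIX filesystem. The sketch below is only a demonstration with made-up file names; it is not related to how Nix manages the store.

```python
import os
import stat
import tempfile
from pathlib import Path


def hardlink_shares_state(directory: Path) -> tuple[str, bool]:
    """Show that two hardlinked names share content and permission bits."""
    working = directory / "working-copy.txt"   # hypothetical file in the flake dir
    in_store = directory / "store-copy.txt"    # hypothetical "store" name for it
    working.write_text("v1")
    os.link(working, in_store)                 # both names now refer to one inode

    working.write_text("v2")                   # in-place edit through the first name
    os.chmod(in_store, 0o444)                  # make the "store" name read-only

    # The mode change is visible through the other name as well.
    same_mode = stat.S_IMODE(working.stat().st_mode) == 0o444
    return in_store.read_text(), same_mode


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(hardlink_shares_state(Path(tmp)))
```

The edit made through the working-copy name shows up under the store name, and making the store name read-only also makes the working copy read-only, which is why hardlinking a working tree into an immutable store is not an option.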

@Atemu
Member

Atemu commented Oct 25, 2023

In newer versions of Linux, you can reflink across VFS barriers as long as both sides share the same superblock. Btrfs, for example, supports this but ZFS does not.

Though I think the big problem with copying is mostly metadata and the associated random access, not the content.

@yajo
Contributor

yajo commented Oct 25, 2023

> is there any way to make Nix do the copying with hardlinks instead?

https://nixos.org/manual/nix/stable/command-ref/conf-file.html#conf-auto-optimise-store does that with files already found in the Nix store. So it won't hardlink them with those in your CWD, but if it has to copy the dirty flake again and again, at least that won't mean triplicated storage. "Only" ~duplicated.
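The general technique behind store optimisation, replacing identical files with hardlinks to a single copy, can be sketched as follows. This is a simplified model and not Nix's code: the `.links` directory name mirrors what the Nix store uses as far as I understand, but hashing raw bytes with SHA-256 and ignoring permission bits are assumptions made to keep the example short.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def dedupe_by_content(store: Path) -> int:
    """Hardlink files with identical content together; return how many were replaced."""
    links_dir = store / ".links"
    links_dir.mkdir(exist_ok=True)
    replaced = 0
    # Materialise the listing first, since the loop modifies the tree.
    for path in sorted(store.rglob("*")):
        if links_dir in path.parents or path.is_symlink() or not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        canonical = links_dir / digest
        if not canonical.exists():
            os.link(path, canonical)           # first file with this content
        elif not os.path.samefile(path, canonical):
            tmp = path.with_name(path.name + ".tmp-link")
            os.link(canonical, tmp)
            os.replace(tmp, path)              # atomically swap in the shared inode
            replaced += 1
    return replaced
```

Note that the duplicate still has to be written once before it can be replaced, so this saves space but not the copy itself.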

@AleXoundOS

> In newer versions of Linux, you can reflink across VFS barriers as long as both sides share the same superblock. Btrfs, for example, supports this but ZFS does not.

On my machine (NixOS 23.05, linux 6.5.5) if source directory and /nix/store are on the same Btrfs filesystem, data is not shared among copies after running nix develop, thus taking space. btrfs filesystem du shows exclusive data for each copy.

It would be nice if it worked, lowering the severity of the problem at least for Btrfs users (even though metadata of such copies still takes up space).

@Atemu
Member

Atemu commented Oct 26, 2023

@AleXoundOS note that reflinking is something an application must choose to do; it is not the default. (IMO it really should be but that's a topic for a different project.)
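To make "an application must choose to do it" concrete: on Linux, a reflink copy is requested explicitly through the `FICLONE` ioctl, and the program has to fall back to a normal copy when the filesystem refuses. The sketch below is Linux-only and illustrative; the function name is hypothetical and this is not what Nix does today.

```python
import fcntl
import shutil
import tempfile
from pathlib import Path

# Linux FICLONE request number, _IOW(0x94, 9, int); defined here so the
# example does not depend on the Python version exposing it.
FICLONE = 0x40049409


def copy_reflink_or_fallback(src, dst) -> bool:
    """Copy src to dst, sharing extents if the filesystem allows it.

    Returns True if a reflink was created, False if a plain copy was made.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return True
        except OSError:
            # Not supported here (e.g. ext4, tmpfs, or different filesystems).
            shutil.copyfileobj(fsrc, fdst)
            return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "src.bin"
        source.write_bytes(b"data" * 1024)
        print(copy_reflink_or_fallback(source, Path(tmp) / "dst.bin"))
```

Either way the destination ends up with the same bytes; only the reflink path avoids duplicating the data blocks, and per-file metadata is still created in both cases.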

I don't think it's worth spending time on this at this point though, as it's much more important to push #6530 forward. After that's done, reflinking would be an extremely minor optimisation; probably not significant enough to be worth considering.

@edolstra edolstra modified the milestones: nix-2.19, nix-2.20 Nov 20, 2023
@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/using-nix-shells-without-polluting-repositories/37362/6

@wmertens
Contributor

Looks like this is not getting anywhere for the moment, how about allowing listing necessary files in the flake?

E.g. the attribute files would be an optional list of relative paths, and if it exists, only those paths get copied. Directories get recursively copied.

To make it more dev-friendly, the list could be globs instead of paths, so that you can include or exclude.
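A rough model of how such a list could be resolved is shown below. Everything here is hypothetical: flakes have no `files` attribute today, and the function name and glob semantics (Python's `pathlib` globbing, directories expanded recursively) are assumptions for the sake of the sketch.

```python
import tempfile
from pathlib import Path


def select_files(root: Path, patterns: list[str]) -> list[str]:
    """Resolve a hypothetical `files` list to the relative paths to copy."""
    selected = set()
    for pattern in patterns:
        for match in root.glob(pattern):
            if match.is_dir():
                # Directories are included recursively.
                selected.update(p for p in match.rglob("*") if p.is_file())
            elif match.is_file():
                selected.add(match)
    return sorted(p.relative_to(root).as_posix() for p in selected)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "pkgs").mkdir()
        (root / "target").mkdir()
        (root / "flake.nix").write_text("{ }")
        (root / "pkgs" / "foo.nix").write_text("{ }")
        (root / "target" / "big.bin").write_bytes(b"\0" * 1024)
        print(select_files(root, ["flake.nix", "pkgs"]))
```

Only the listed file and the contents of the listed directory are selected; build artifacts under `target` are never considered for copying.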

@axelkar

axelkar commented Mar 27, 2024

If you don't use IFD, can you just first load all nix files imported by flake.nix into memory and eval them then?

Can someone tell me why you'd need to copy directories flakes reside in into the store if self is not used as a path? E.g. not used at all or just an attr like self.packages.x86_64-linux.default. It could be alright for self + /pkgs/foo.nix but that doesn't justify copying every single file, does it?

What about simple devshell flakes? Why are their directories copied into the store? It's just one nix file that doesn't use ./. or self.

I personally use path:///home/axel/.nix-config just so I don't have to commit every single time I try a new change.
path:// doesn't copy anything into the store, right? Could there be a better way to bypass Git?

Edit: never mind the last part, I thought path:// was going away. Anyway, could .git be copied or just read while evaluating? It ignores files from .gitignore and only contains committed files, so it's perfectly pure. Can someone explain why it needs copying?

Edit 2: If something like a files attribute or .flakeignore were to be added could derivations use something like bind mounts or links to not have to copy anything to the store?

@2xsaiko

2xsaiko commented Mar 27, 2024

> I personally use path:///home/axel/.nix-config just so I don't have to commit every single time I try a new change.
> path:// doesn't copy anything into the store, right? Could there be a better way to bypass Git?

You don't have to commit every time you make a change with the git fetcher ("tree is dirty" is just a warning) and flakes are always copied to the store, even with path URLs. If that doesn't work for you, have a shell.nix until #6530 is complete.

@tmillr

tmillr commented Sep 21, 2024

What about unchanged files within the flake/source dir? Is there any way to keep those from getting duplicated in the store (e.g. have unchanged files in the new source copy use a hard or soft link to avoid wasting space)?

If I'm not mistaken, the flake source dir that's copied to the store is hashed at the root/dir level and not the file level, thus reusing unchanged files from an earlier flake eval or build is not implicitly done?
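The difference between hashing at the directory level and at the file level can be shown with a toy model. This sketch does not use Nix's NAR serialisation; the tree hash below is a simplified stand-in (SHA-256 over sorted path and content-hash pairs), and both function names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def file_hashes(root: Path) -> dict[str, str]:
    """Per-file content hashes, keyed by path relative to root."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def tree_hash(root: Path) -> str:
    """One hash for the whole tree; any changed file changes it."""
    h = hashlib.sha256()
    for rel, digest in file_hashes(root).items():
        h.update(rel.encode() + b"\0" + digest.encode() + b"\n")
    return h.hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "flake.nix").write_text("{ }")
        (root / "README").write_text("hello")
        before_tree, before_files = tree_hash(root), file_hashes(root)
        (root / "README").write_text("hello, world")
        print(tree_hash(root) != before_tree)                           # tree changed
        print(file_hashes(root)["flake.nix"] == before_files["flake.nix"])  # file did not
```

Editing one file yields a new tree-level identity, and therefore a new copy of the whole directory, even though the per-file hash of every untouched file stays the same; that per-file identity is what content-based deduplication can exploit afterwards.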

@fgaz
Member

fgaz commented Sep 22, 2024

@tmillr you can set auto-optimise-store = true in nix.conf.

@Atemu
Member

Atemu commented Sep 22, 2024

While that reduces the space usage needed asynchronously, it still does a copy which wastes IO time and write endurance.

@tmillr

tmillr commented Sep 23, 2024

> @tmillr you can set auto-optimise-store = true in nix.conf.

Thank you. I guess I forgot that that mechanism works based off of file content and not nix store hash (or something else).

I am using that currently, but I have to run a custom nix store optimise daemon/service on a schedule instead of using the option (as the option causes random build issues/failures on darwin last I checked).

It sounds like even using that option causes extra fs writes to occur however (but will work for saving space alone).

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/confused-about-nix-copying-nixpkgs/53737/2

@anka-213

anka-213 commented Oct 3, 2024

> note that reflinking is something an application must choose to do; it is not the default.
>
> I don't think it's worth spending time on this at this point though, as it's much more important to push #6530 forward. After that's done, reflinking would be an extremely minor optimisation; probably not significant enough to be worth considering.

@Atemu Reflinking wouldn’t be significant for reducing the copies incurred by the eval, but it could still be significant in reducing the otherwise still necessary copies of inputs for the build itself, right?

@Atemu
Member

Atemu commented Oct 3, 2024

I'm not sure what you mean by that. Reflink copies speed up copying any path to the Nix store as well as reducing storage cost (though in an unpredictable manner), but the best thing to do in any case is to copy as few things to the Nix store as possible.

For reducing copies between existing build inputs in the Nix store, the right tool is Nix store optimisation as @fgaz mentioned and it's filesystem-agnostic.

Projects
Status: In Progress