Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib.fileset.gitTracked: init #264891

Merged
merged 3 commits into from
Nov 16, 2023
Merged

lib.fileset.gitTracked: init #264891

merged 3 commits into from
Nov 16, 2023

Conversation

infinisil
Copy link
Member

@infinisil infinisil commented Nov 1, 2023

Description of changes

This adds two new functions to lib.fileset:

  • lib.fileset.gitTracked :: Path -> FileSet: Create a file set containing all Git-tracked files in the working tree of a local Git repository.
    lib.fileset.gitTracked ./.
  • lib.fileset.gitTrackedWith :: { recurseSubmodules :: Bool ? false } -> Path -> FileSet: Same, but optionally also recurse into submodules.
    lib.fileset.gitTrackedWith { recurseSubmodules = true; } ./.

These functions are useful since it's very common to only want to include files tracked by Git. So much so that Flakes does it by default.

Combined with the fact that Flakes currently always copies all files to the Nix store, these functions are somewhat useless with Flakes:

  • If you use this for a git+file: input, Flakes will have already implicitly filtered out all files not tracked by Git, including .git, so this function can't even be called on the local directory then.
  • If you use a path:. input instead of implicitly letting Flakes treat it as a Git directory, it will import all files into the Nix store, which is exactly what this library is trying to let you control.
  • Even disregarding the above, there's a problem preventing this function from being usable here as is: builtins.fetchGit dirty mode almost unusable in pure evaluation mode nix#9292

But for stable Nix, these functions are a requirement, since it's not otherwise possible to create a file set from just Git-tracked files (builtins.fetchGit returns a store path, which doesn't work with file sets).

This work is sponsored by Antithesis

Things done

  • Interface decided
  • Code comments
  • User documentation
  • Tests

@github-actions github-actions bot added the 6.topic: lib The Nixpkgs function library label Nov 1, 2023
@infinisil infinisil mentioned this pull request Nov 1, 2023
7 tasks
@infinisil infinisil changed the title lib.fileset.gitTracked: init lib.fileset.gitTracked: init Nov 1, 2023
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Nov 1, 2023
@infinisil infinisil mentioned this pull request Nov 1, 2023
3 tasks
@infinisil infinisil marked this pull request as ready for review November 3, 2023 00:39
@roberth
Copy link
Member

roberth commented Nov 6, 2023

gitTracked should be a consumer of the library rather than be part of it.
Its domain is git, not sets.

Please rename it to, say, lib.git.trackedFileSet.

@fricklerhandwerk
Copy link
Contributor

While this adds means to strengthen build purity, which user demand is driving this change?

@infinisil
Copy link
Member Author

infinisil commented Nov 8, 2023

@fricklerhandwerk Pretty much every Git-based project using stable Nix can benefit from this function, since it provides a good base set of files to further customise. Even manually it's not practically possible to do this without this function, because you'd have to list each file tracked in Git individually in a unions, there's no nice way to get the list of git-tracked files.

Comparatively, this is already possible with lib.sources, because you can just do lib.cleanSourceWith { src = builtins.fetchGit ./.; filter = ...; } or so.

There is a possibility of having a lib.git.lsTrackedFiles :: Path -> [ Path ], then one could do

# Equivalent to the proposed lib.fileset.gitTracked ./.
lib.fileset.unions (lib.git.lsTrackedFiles ./.)

But:

  • It's less lazy and less efficient, since everything file has to go through the length-strict list
  • It would be less discoverable and be more annoying to type

I'd prefer just having lib.fileset.gitTracked for convenience (whose importance we shouldn't underestimate).

@fricklerhandwerk
Copy link
Contributor

The fact that it can be done with existing library functions, especially as trivially as you're showing, and especially given you're already working on fromSource, indicates to me we shouldn't spend more energy here.

@infinisil
Copy link
Member Author

@fricklerhandwerk No it cannot be done with existing functions (unless you pretty much copy what I did in this PR). lib.git.lsTrackedFiles doesn't exist, and even if it did, it would be a bunch less efficient than the function in this PR.

I really think something like this PR is required for an enjoyable experience with the file set library. Nobody (me included) wants to have to manually filter out files not tracked by Git.

Copy link
Member

@roberth roberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not pretty, the way it is, or where it is, but it's useful. One request; otherwise lgtm.

lib/fileset/default.nix Show resolved Hide resolved
@roberth
Copy link
Member

roberth commented Nov 10, 2023

Perhaps also relevant is a helper that only applies this when there's a .git at the expected location. That way users can support local development and tarballs simultaneously, e.g. if the code will be used in another project via niv or flakes.

@roberth
Copy link
Member

roberth commented Nov 10, 2023

Nobody (me included) wants to have to manually filter out files not tracked by Git.

Am I nobody to you? ;)

EDIT: actually it's not possible, because tracked and non-ignored are different sets, where tracked requires reading the staging area, which I don't think is feasible.

@infinisil
Copy link
Member Author

Perhaps also relevant is a helper that only applies this when there's a .git at the expected location. That way users can support local development and tarballs simultaneously, e.g. if the code will be used in another project via niv or flakes.

This could be done with

if builtins.pathExists ./.git then
  gitTracked ./.
else
  ./.

But yeah having a convenience function for this might be nice. Better in a separate PR though :)

EDIT: actually it's not possible, because tracked and non-ignored are different sets, where tracked requires reading the staging area, which I don't think is feasible.

Hehe yeah, I wonder if we should have a combination of the two though, something like "all files that would be tracked if you ran git add -A". Probably that would be

let
  tracked = gitTracked ./.;
  ignored = gitIgnored ./.;
in
union tracked
  (intersection
    (difference ./. tracked)
    (difference ./. ignored)
  )

(and probably some nice set identity could be used to make this smaller).

But that's also for another PR :)

*/
gitTracked =
/*
The [path](https://nixos.org/manual/nix/stable/language/values#type-path) to a local working directory of the Git repository whose tracked files to include.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The [path](https://nixos.org/manual/nix/stable/language/values#type-path) to a local working directory of the Git repository whose tracked files to include.
The [path](https://nixos.org/manual/nix/stable/language/values#type-path) to a local Git repository.
  • Remove "working directory" because that's less unambiguously the root of the repo.
  • Don't state too much of the obvious.

Not sure where to document the return value, but I'll make another suggestion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think for correctness we should use working directory, because there's also bare repositories without a working directory, see https://git-scm.com/docs/gitrepository-layout#_description.

I do agree to remove the "whose tracked files to include" though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The [path](https://nixos.org/manual/nix/stable/language/values#type-path) to a local working directory of the Git repository whose tracked files to include.
The [path](https://nixos.org/manual/nix/stable/language/values#type-path) to a local working directory of the Git repository.
This directory must contain a `.git` file or subdirectory.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The [path](https://nixos.org/manual/nix/stable/language/values#type-path) to a local working directory of the Git repository whose tracked files to include.
The [path](https://nixos.org/manual/nix/stable/language/values#type-path) to the working directory of a local Git repository.
This directory must contain a `.git` file or subdirectory.
  • Switch "the" and "a" around, so we don't suggest there's any choice of directories within a repo. (Technically it may have worktrees but that's not even of secondary importance, and can be fully explained by a small vagueness on our end when it comes to the definition of repository.)

If that's not ok, consider

Suggested change
The [path](https://nixos.org/manual/nix/stable/language/values#type-path) to a local working directory of the Git repository whose tracked files to include.
The [path](https://nixos.org/manual/nix/stable/language/values#type-path) to a local non-bare Git repository.
This directory must contain a `.git` file or subdirectory.

or if you don't mind an association with the worktree feature, we can be really brief:

Suggested change
The [path](https://nixos.org/manual/nix/stable/language/values#type-path) to a local working directory of the Git repository whose tracked files to include.
The [path](https://nixos.org/manual/nix/stable/language/values#type-path) to a local Git worktree with a valid `.git` file or directory.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the first suggestion, will apply.

@@ -510,4 +516,107 @@ If a directory does not recursively contain any file, it is omitted from the sto
# We could also return the original fileset argument here,
# but that would then duplicate work for consumers of the fileset, because then they have to coerce it again
actualFileset;

/*
Like [`gitTrackedWith`](#function-library-lib.fileset.gitTrackedWith) with `{}` as the first argument, therefore using the defaults.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Like [`gitTrackedWith`](#function-library-lib.fileset.gitTrackedWith) with `{}` as the first argument, therefore using the defaults.
A fileset with all files from a repository; typically passed to [`intersect`](#function-library-lib.fileset.intersect).
This function behaves like [`gitTrackedWith { }`](#function-library-lib.fileset.gitTrackedWith) - using the defaults.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Like [`gitTrackedWith`](#function-library-lib.fileset.gitTrackedWith) with `{}` as the first argument, therefore using the defaults.
Create a file set containing all [Git-tracked files](https://git-scm.com/book/en/v2/Git-Basics-Recording-Changes-to-the-Repository) in a repository.
This function behaves like [`gitTrackedWith { }`](#function-library-lib.fileset.gitTrackedWith) - using the defaults.

@ofborg ofborg bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Nov 10, 2023
A configuration parameter for gitTrackedWith will be introduced in the
next commit
@infinisil
Copy link
Member Author

infinisil commented Nov 16, 2023

Force push to apply the suggestions: (diff)

Now rebasing on top of master, I think this is ready.

@ofborg ofborg bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Nov 16, 2023
Copy link
Member

@roberth roberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after

fetchResult = builtins.fetchGit path;
in
if inPureEvalMode then
throw "lib.fileset.gitTracked: This function is currently not supported in pure evaluation mode, since it currently relies on `builtins.fetchGit`, see https://github.com/NixOS/nix/issues/9292."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
throw "lib.fileset.gitTracked: This function is currently not supported in pure evaluation mode, since it currently relies on `builtins.fetchGit`, see https://github.com/NixOS/nix/issues/9292."
throw "lib.fileset.gitTracked: This function is currently not supported in pure evaluation mode, since it currently relies on `builtins.fetchGit`. See https://github.com/NixOS/nix/issues/9292."

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test needs it too :/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha right, it's annoying but I'd be very much screwed without these tests

Pushed

@infinisil infinisil merged commit b399b68 into NixOS:master Nov 16, 2023
19 checks passed
@infinisil infinisil deleted the fileset.gitTracked branch November 16, 2023 14:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: lib The Nixpkgs function library 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants