Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib.fileset.intersection: init #257356

Merged
merged 3 commits into from
Oct 11, 2023
Merged

lib.fileset.intersection: init #257356

merged 3 commits into from
Oct 11, 2023

Conversation

infinisil
Copy link
Member

@infinisil infinisil commented Sep 26, 2023

Description of changes

This is another split off from the WIP fileset combinators PR.
This adds one function:

  • lib.fileset.intersection: Computes the intersection between two file sets

This work is sponsored by Antithesis

Things done

  • Tests
  • Documentation

@github-actions github-actions bot added the 6.topic: lib The Nixpkgs function library label Sep 26, 2023
@infinisil infinisil mentioned this pull request Sep 25, 2023
7 tasks
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Sep 26, 2023
@infinisil infinisil force-pushed the fileset.intersect branch 2 times, most recently from cc723f2 to 98346b0 Compare October 3, 2023 22:27
@infinisil infinisil changed the title lib.fileset,intersect, lib.fileset.intersects: init lib.fileset,intersection: init Oct 3, 2023
@infinisil infinisil changed the title lib.fileset,intersection: init lib.fileset.intersection: init Oct 3, 2023
Comment on lines +141 to +159
### No intersection for lists

While there is `intersection a b`, there is no function `intersections [ a b c ]`.

Arguments:
- (+) There is no known use case for such a function, it can be added later if a use case arises
- (+) There is no suitable return value for `intersections [ ]`, see also "Nullary intersections" [here](https://en.wikipedia.org/w/index.php?title=List_of_set_identities_and_relations&oldid=1177174035#Definitions)
- (-) Could throw an error for that case
- (-) Create a special value to represent "all the files" and return that
- (+) Such a value could then not be used with `fileFilter` unless the internal representation is changed considerably
- (-) Could return the empty file set
- (+) This would be wrong in set theory
- (-) Inconsistent with `union` and `unions`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roberth @fricklerhandwerk This PR isn't done yet, but I wonder what you think of this design decision.

It's an interesting duality:

  • Union is mostly needed between many file sets (e.g. combining together the exact files you need)
  • Intersection is mostly needed between few file sets (e.g. limiting the selected files to some directory)

We might really only need unions (taking a list) and intersection (binary operation).
union (binary operation) doesn't seem that important, but I can imagine some situations where it could be useful and we have it already.

Notably Haskell's Data.Set also provides union, unions, intersection but no intersections (there is one, but it's in Data.Set.Internal).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could have an interface that requires at least one set: intersect :: a -> [a] -> a. But probably a binary one is good enough. One can still fold over a list, then defining the base set will be required for sane results. That again we can add to documentation as an example.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In math it's easy to invoke a somewhat implicitly defined "universe" set. Us finite mortals would have to make that explicit, which is another way to land on a -> List a -> a whereas @fricklerhandwerk seems to have arrived through NonEmptyList a -> a.

@infinisil infinisil force-pushed the fileset.intersect branch 3 times, most recently from 10a4700 to 81bb565 Compare October 4, 2023 19:35
@infinisil infinisil marked this pull request as ready for review October 4, 2023 19:36
@infinisil
Copy link
Member Author

This is ready to review now. There's tests and documentation now.

@delroth delroth added the 12.approvals: 1 This PR was reviewed and approved by one reputable person label Oct 8, 2023
@infinisil
Copy link
Member Author

@roberth This PR is ready to be merged imo, so I'm planning to merge it myself in a couple days unless you have some concerns until then. And if it looks good to you I wouldn't mind it being merged earlier of course :)


Arguments:
- Alternative: Use the common prefix of all base paths as the resulting base path
- (-) This is unnecessarily strict, because the purpose of the base path is to track the directory under which files _could_ be in the file set.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't it already defined to be the most specific, longest path that contains them all? Or is it not feasible to make that truly an invariant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not defined like that yet. It's a bit tricky because file sets can be empty but still have a base path, but I think it could be done with some careful wording

lib/fileset/README.md Outdated Show resolved Hide resolved
Copy link
Member

@roberth roberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A suggestion and some thoughts. LGTM.


Example:
# Limit the selected files to the ones in ./., so only ./src and ./Makefile
intersection ./. (unions [ ../LICENSE ./src ./Makefile ])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a more realistic example that readily applies in many projects:

let buildFiles = fileFilter (f: f.ext != "nix") ../.;
in intersection buildFiles (unions [ ../build-support ./src ./Makefile ])

Not sure if I got the filter right. Had to change LICENSE in order to realistically encounter .nix files in the union.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially I wrote the fileFilter without a path (../.). This would have been feasible if in a design where we allow infinite, non-enumerable sets such as intersections [ ].

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buildFiles is particularly useful if you have multiple derivations that each build some part of that set.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fileFilter doesn't exist yet, but I just opened the PR for it yesterday: #260265

I can add this example in the PR once this one is merged

@delroth delroth added 12.approvals: 2 This PR was reviewed and approved by two reputable people and removed 12.approvals: 1 This PR was reviewed and approved by one reputable person labels Oct 11, 2023
infinisil and others added 2 commits October 11, 2023 16:17
Co-authored-by: Robert Hensing <robert@roberthensing.nl>
@delroth delroth removed the 12.approvals: 2 This PR was reviewed and approved by two reputable people label Oct 11, 2023
@infinisil infinisil merged commit 006b28f into NixOS:master Oct 11, 2023
6 checks passed
@infinisil infinisil deleted the fileset.intersect branch October 11, 2023 15:33
infinisil added a commit to tweag/nixpkgs that referenced this pull request Nov 15, 2023
While this change is backwards-incompatible, I think it's okay because:
- The `fileFilter` function is not yet in a stable NixOS release, it was only merged about [a month ago](NixOS#257356).
  - All public uses of the function on GitHub only pass a path
- Any `fileFilter pred fileset` can also be expressed as `intersection fileset (fileFilter pred path)` without loss of functionality.
  - This is furthermore pointed out in the new error message when a file set is passed
github-actions bot pushed a commit to arcnmx/nixpkgs-lib that referenced this pull request Nov 16, 2023
While this change is backwards-incompatible, I think it's okay because:
- The `fileFilter` function is not yet in a stable NixOS release, it was only merged about [a month ago](NixOS/nixpkgs#257356).
  - All public uses of the function on GitHub only pass a path
- Any `fileFilter pred fileset` can also be expressed as `intersection fileset (fileFilter pred path)` without loss of functionality.
  - This is furthermore pointed out in the new error message when a file set is passed
github-actions bot pushed a commit to nix-community/nixpkgs.lib that referenced this pull request Nov 19, 2023
While this change is backwards-incompatible, I think it's okay because:
- The `fileFilter` function is not yet in a stable NixOS release, it was only merged about [a month ago](NixOS/nixpkgs#257356).
  - All public uses of the function on GitHub only pass a path
- Any `fileFilter pred fileset` can also be expressed as `intersection fileset (fileFilter pred path)` without loss of functionality.
  - This is furthermore pointed out in the new error message when a file set is passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: lib The Nixpkgs function library 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants