Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib.fileset: Representation for empty file sets without a base path #257351

Merged
merged 1 commit into from
Oct 3, 2023

Conversation

infinisil
Copy link
Member

@infinisil infinisil commented Sep 25, 2023

Description of changes

Changes the internal fileset structure to also be able to represent empty file sets without any base path. This can then be used as the return value of unions [].

Once the lib.fileset.intersect combinator from the full WIP combinators is added, this will also be suitable as a return value for e.g. intersect /foo /bar.

Notably the new empty singleton value is not exposed in the interface, because I don't know of any use case for it. It could be added later if a use case emerges.

This work is sponsored by Antithesis

Things done

  • Added tests
  • Benchmarked (minor change, not significant)

@github-actions github-actions bot added the 6.topic: lib The Nixpkgs function library label Sep 25, 2023
@infinisil infinisil changed the title Make lib.fileset.unions [] work lib.fileset: Representation for empty file sets without a base path Sep 25, 2023
@infinisil infinisil mentioned this pull request Sep 25, 2023
7 tasks
lib/fileset/default.nix Outdated Show resolved Hide resolved
lib/fileset/default.nix Outdated Show resolved Hide resolved
lib/fileset/README.md Outdated Show resolved Hide resolved
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Sep 26, 2023
@fricklerhandwerk
Copy link
Contributor

fricklerhandwerk commented Sep 26, 2023

I'm not sure about the approach here. I understand that having a value to represent unions [] would be nice to have, but the special case clutters the code quite a bit, moving it past the sweet spot of readability, and that particular edge case seems really not that important. Also I don't see why intersect /foo /bar can't return { base = /; tree = null; }. The overall change is therefore of questionable utility to me.

@infinisil
Copy link
Member Author

Also I don't see why intersect /foo /bar can't return { base = /; tree = null; }`.

The motivation comes from this example, which should work:

toSource {
  root = ./foo;
  fileset = intersect ./foo ./.;
}

If intersect ./foo ./. would use the common prefix (./.) as the base path, toSource would complain that the base path is outside of the root, when clearly that's not what the user would expect. So the base path of intersect ./foo ./. should instead be ./foo.

Extending that, this should also succeed without an error:

toSource {
  root = ./foo;
  fileset = intersect ./foo ./bar;
}

though this would give an empty result, but that's as one would expect.

(this will definitely have to go into the design document)

@roberth
Copy link
Member

roberth commented Sep 26, 2023

I'd rather have a working abstraction than a leaky (broken) one that is "extra readable".

@fricklerhandwerk
Copy link
Contributor

So the base path of intersect ./foo ./. should instead be ./foo.

Agreed that the combination of intersect ./foo ./. and toSource { root = ./foo; } as you describe it may be confusing, but not convinced about intersect in isolation. Just from looking at it, it's not evident what the base path should be, and intuitively I'd say it's their common prefix. If you want it to be the longest path in the resulting set, it should be documented very visibly. Maybe even guard it by an invariant that the base path is never shortened in operations?

Only then it seems inevitable to produce a special kind of value for intersect ./foo ./bar.

@roberth: Fully agreed. A particularly involved implementation is still better with a well-articulated rationale.

@infinisil
Copy link
Member Author

infinisil commented Sep 26, 2023

Just from looking at it, it's not evident what the base path should be, and intuitively I'd say it's their common prefix. If you want it to be the longest path in the resulting set, it should be documented very visibly.

The base path is mostly an invisible implementation detail, users shouldn't have to know about it. The only interaction of it with the user is in this error message:

nix-repl> fileset.toSource { root = ./foo; fileset = ./.; }
error: lib.fileset.toSource: `fileset` could contain files in /home/user/src,
  which is not under the `root` (/home/user/src/foo). Potential solutions:
   - Set `root` to /home/user/src or any directory higher up.
     This changes the layout of the resulting store path.
   - Set `fileset` to a file set that cannot contain files outside the `root` (/home/user/src/foo).
     This could change the files included in the result.

This error should only be thrown when it's relevant, meaning when there could be some file outside the root in the fileset.

intersect ./foo ./. can never contain files outside ./foo, so it should not thrown this error. Similarly, intersect ./foo ./bar can never contain any files at all, so it should not throw that error either.

Maybe even guard it by an invariant that the base path is never shortened in operations?

Comparatively, union needs the common prefix path as the base path, it's a kind of duality! union ./foo ./bar can contain files from either, so the base path should be ./..

@infinisil infinisil force-pushed the fileset.empty branch 2 times, most recently from 34dde44 to 5141148 Compare September 29, 2023 14:22
@infinisil
Copy link
Member Author

I now updated this to use _internalIsEmptyBaseless, I liked this one the most. I'd rather not bikeshed over the name any longer, this is not exposed in the API so it can also be changed later if desired.

Furthermore, I added a section to the design document explaining why such a value is needed and the problem with alternatives. Valentin also confirmed in a meeting that he agrees with this being needed.

I think this is ready to be merged, and unless there are any complaints in the next couple days I'll merge this myself.

@delroth delroth added the 12.approvals: 1 This PR was reviewed and approved by one reputable person label Oct 2, 2023
`unions []` now works!

Notably the new empty value without a base is not exposed in the interface.
I don't know of any use case for it.
@infinisil
Copy link
Member Author

In PM's @fricklerhandwerk made the suggestion to call it ..withoutBase or ..noBase instead. That sounds good to me, so I now changed it to _internalIsEmptyWithoutBase.

Copy link
Member

@roberth roberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One off-topic suggestion. LGTM!

@@ -41,9 +41,16 @@ An attribute set with these values:
- `_type` (constant string `"fileset"`):
Tag to indicate this value is a file set.

- `_internalVersion` (constant `2`, the current version):
- `_internalVersion` (constant `3`, the current version):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be in a file like fileset/internal/README.md?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I view lib/fileset/README.md as the internal contributor documentation, where I think this makes sense to have. It's not part of the public interface, but you need to know about it to contribute, which can mean to change the internals

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should link to the user docs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now done in #258832 (already merged myself)

@delroth delroth added 12.approvals: 2 This PR was reviewed and approved by two reputable people and removed 12.approvals: 1 This PR was reviewed and approved by one reputable person labels Oct 3, 2023
@roberth roberth merged commit 812887f into NixOS:master Oct 3, 2023
7 checks passed
@infinisil infinisil deleted the fileset.empty branch October 3, 2023 16:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: lib The Nixpkgs function library 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 12.approvals: 2 This PR was reviewed and approved by two reputable people
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants