Skip to content

Commit

Permalink
lib.fileset: Internal representation v3, empty value without a base
Browse files Browse the repository at this point in the history
`unions []` now works!

Notably the new empty value without a base is not exposed in the interface.
I don't know of any use case for it.
  • Loading branch information
infinisil committed Oct 3, 2023
1 parent 0dd560f commit 4f35f00
Show file tree
Hide file tree
Showing 4 changed files with 119 additions and 26 deletions.
29 changes: 28 additions & 1 deletion lib/fileset/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -41,9 +41,16 @@ An attribute set with these values:
- `_type` (constant string `"fileset"`):
Tag to indicate this value is a file set.

- `_internalVersion` (constant `2`, the current version):
- `_internalVersion` (constant `3`, the current version):
Version of the representation.

- `_internalIsEmptyWithoutBase` (bool):
Whether this file set is the empty file set without a base path.
If `true`, `_internalBase*` and `_internalTree` are not set.
This is the only way to represent an empty file set without needing a base path.

Such a value can be used as the identity element for `union` and the return value of `unions []` and co.

- `_internalBase` (path):
Any files outside of this path cannot influence the set of files.
This is always a directory.
Expand Down Expand Up @@ -111,6 +118,26 @@ Arguments:
- (+) This can be removed later, if we discover it's too restrictive
- (-) It leads to errors when a sensible result could sometimes be returned, such as in the above example.

### Empty file set without a base

There is a special representation for an empty file set without a base path.
This is used for return values that should be empty but when there's no base path that would makes sense.

Arguments:
- Alternative: This could also be represented using `_internalBase = /.` and `_internalTree = null`.
- (+) Removes the need for a special representation.
- (-) Due to [influence tracking](#influence-tracking),
`union empty ./.` would have `/.` as the base path,
which would then prevent `toSource { root = ./.; fileset = union empty ./.; }` from working,
which is not as one would expect.
- (-) With the assumption that there can be multiple filesystem roots (as established with the [path library](../path/README.md)),
this would have to cause an error with `union empty pathWithAnotherFilesystemRoot`,
which is not as one would expect.
- Alternative: Do not have such a value and error when it would be needed as a return value
- (+) Removes the need for a special representation.
- (-) Leaves us with no identity element for `union` and no reasonable return value for `unions []`.
From a set theory perspective, which has a well-known notion of empty sets, this is unintuitive.

### Empty directories

File sets can only represent a _set_ of local files, directories on their own are not representable.
Expand Down
7 changes: 2 additions & 5 deletions lib/fileset/default.nix
Original file line number Diff line number Diff line change
Expand Up @@ -156,7 +156,7 @@ If a directory does not recursively contain any file, it is omitted from the sto
lib.fileset.toSource: `root` is of type ${typeOf root}, but it should be a path instead.''
# Currently all Nix paths have the same filesystem root, but this could change in the future.
# See also ../path/README.md
else if rootFilesystemRoot != filesetFilesystemRoot then
else if ! fileset._internalIsEmptyWithoutBase && rootFilesystemRoot != filesetFilesystemRoot then
throw ''
lib.fileset.toSource: Filesystem roots are not the same for `fileset` and `root` ("${toString root}"):
`root`: root "${toString rootFilesystemRoot}"
Expand All @@ -170,7 +170,7 @@ If a directory does not recursively contain any file, it is omitted from the sto
lib.fileset.toSource: `root` (${toString root}) is a file, but it should be a directory instead. Potential solutions:
- If you want to import the file into the store _without_ a containing directory, use string interpolation or `builtins.path` instead of this function.
- If you want to import the file into the store _with_ a containing directory, set `root` to the containing directory, such as ${toString (dirOf root)}, and set `fileset` to the file path.''
else if ! hasPrefix root fileset._internalBase then
else if ! fileset._internalIsEmptyWithoutBase && ! hasPrefix root fileset._internalBase then
throw ''
lib.fileset.toSource: `fileset` could contain files in ${toString fileset._internalBase}, which is not under the `root` (${toString root}). Potential solutions:
- Set `root` to ${toString fileset._internalBase} or any directory higher up. This changes the layout of the resulting store path.
Expand Down Expand Up @@ -264,9 +264,6 @@ If a directory does not recursively contain any file, it is omitted from the sto
filesets:
if ! isList filesets then
throw "lib.fileset.unions: Expected argument to be a list, but got a ${typeOf filesets}."
else if filesets == [ ] then
# TODO: This could be supported, but requires an extra internal representation for the empty file set, which would be special for not having a base path.
throw "lib.fileset.unions: Expected argument to be a list with at least one element, but it contains no elements."
else
pipe filesets [
# Annotate the elements with context, used by _coerceMany for better errors
Expand Down
69 changes: 56 additions & 13 deletions lib/fileset/internal.nix
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ let
drop
elemAt
filter
findFirst
findFirstIndex
foldl'
head
Expand Down Expand Up @@ -64,7 +65,7 @@ rec {
# - Increment this version
# - Add an additional migration function below
# - Update the description of the internal representation in ./README.md
_currentVersion = 2;
_currentVersion = 3;

# Migrations between versions. The 0th element converts from v0 to v1, and so on
migrations = [
Expand All @@ -89,8 +90,34 @@ rec {
_internalVersion = 2;
}
)

# Convert v2 into v3: filesetTree's now have a representation for an empty file set without a base path
(
filesetV2:
filesetV2 // {
# All v1 file sets are not the new empty file set
_internalIsEmptyWithoutBase = false;
_internalVersion = 3;
}
)
];

_noEvalMessage = ''
lib.fileset: Directly evaluating a file set is not supported. Use `lib.fileset.toSource` to turn it into a usable source instead.'';

# The empty file set without a base path
_emptyWithoutBase = {
_type = "fileset";

_internalVersion = _currentVersion;

# The one and only!
_internalIsEmptyWithoutBase = true;

# Double __ to make it be evaluated and ordered first
__noEval = throw _noEvalMessage;
};

# Create a fileset, see ./README.md#fileset
# Type: path -> filesetTree -> fileset
_create = base: tree:
Expand All @@ -103,14 +130,15 @@ rec {
_type = "fileset";

_internalVersion = _currentVersion;

_internalIsEmptyWithoutBase = false;
_internalBase = base;
_internalBaseRoot = parts.root;
_internalBaseComponents = components parts.subpath;
_internalTree = tree;

# Double __ to make it be evaluated and ordered first
__noEval = throw ''
lib.fileset: Directly evaluating a file set is not supported. Use `lib.fileset.toSource` to turn it into a usable source instead.'';
__noEval = throw _noEvalMessage;
};

# Coerce a value to a fileset, erroring when the value cannot be coerced.
Expand Down Expand Up @@ -155,14 +183,20 @@ rec {
_coerce "${functionContext}: ${context}" value
) list;

firstBaseRoot = (head filesets)._internalBaseRoot;
# Find the first value with a base, there may be none!
firstWithBase = findFirst (fileset: ! fileset._internalIsEmptyWithoutBase) null filesets;
# This value is only accessed if first != null
firstBaseRoot = firstWithBase._internalBaseRoot;

# Finds the first element with a filesystem root different than the first element, if any
differentIndex = findFirstIndex (fileset:
firstBaseRoot != fileset._internalBaseRoot
# The empty value without a base doesn't have a base path
! fileset._internalIsEmptyWithoutBase
&& firstBaseRoot != fileset._internalBaseRoot
) null filesets;
in
if differentIndex != null then
# Only evaluates `differentIndex` if there are any elements with a base
if firstWithBase != null && differentIndex != null then
throw ''
${functionContext}: Filesystem roots are not the same:
${(head list).context}: root "${toString firstBaseRoot}"
Expand Down Expand Up @@ -311,7 +345,7 @@ rec {
# Special case because the code below assumes that the _internalBase is always included in the result
# which shouldn't be done when we have no files at all in the base
# This also forces the tree before returning the filter, leads to earlier error messages
if tree == null then
if fileset._internalIsEmptyWithoutBase || tree == null then
empty
else
nonEmpty;
Expand All @@ -321,7 +355,12 @@ rec {
# Type: [ Fileset ] -> Fileset
_unionMany = filesets:
let
first = head filesets;
# All filesets that have a base, aka not the ones that are the empty value without a base
filesetsWithBase = filter (fileset: ! fileset._internalIsEmptyWithoutBase) filesets;

# The first fileset that has a base.
# This value is only accessed if there are at all.
firstWithBase = head filesetsWithBase;

# To be able to union filesetTree's together, they need to have the same base path.
# Base paths can be unioned by taking their common prefix,
Expand All @@ -332,14 +371,14 @@ rec {
# so this cannot cause a stack overflow due to a build-up of unevaluated thunks.
commonBaseComponents = foldl'
(components: el: commonPrefix components el._internalBaseComponents)
first._internalBaseComponents
firstWithBase._internalBaseComponents
# We could also not do the `tail` here to avoid a list allocation,
# but then we'd have to pay for a potentially expensive
# but unnecessary `commonPrefix` call
(tail filesets);
(tail filesetsWithBase);

# The common base path assembled from a filesystem root and the common components
commonBase = append first._internalBaseRoot (join commonBaseComponents);
commonBase = append firstWithBase._internalBaseRoot (join commonBaseComponents);

# A list of filesetTree's that all have the same base path
# This is achieved by nesting the trees into the components they have over the common base path
Expand All @@ -351,14 +390,18 @@ rec {
setAttrByPath
(drop (length commonBaseComponents) fileset._internalBaseComponents)
fileset._internalTree
) filesets;
) filesetsWithBase;

# Folds all trees together into a single one using _unionTree
# We do not use a fold here because it would cause a thunk build-up
# which could cause a stack overflow for a large number of trees
resultTree = _unionTrees trees;
in
_create commonBase resultTree;
# If there's no values with a base, we have no files
if filesetsWithBase == [ ] then
_emptyWithoutBase
else
_create commonBase resultTree;

# The union of multiple filesetTree's with the same base path.
# Later elements are only evaluated if necessary.
Expand Down
40 changes: 33 additions & 7 deletions lib/fileset/tests.sh
Original file line number Diff line number Diff line change
Expand Up @@ -282,24 +282,27 @@ expectFailure 'toSource { root = ./.; fileset = ./a; }' 'lib.fileset.toSource: `

# File sets cannot be evaluated directly
expectFailure 'union ./. ./.' 'lib.fileset: Directly evaluating a file set is not supported. Use `lib.fileset.toSource` to turn it into a usable source instead.'
expectFailure '_emptyWithoutBase' 'lib.fileset: Directly evaluating a file set is not supported. Use `lib.fileset.toSource` to turn it into a usable source instead.'

# Past versions of the internal representation are supported
expectEqual '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 0; _internalBase = ./.; }' \
'{ _internalBase = ./.; _internalBaseComponents = path.subpath.components (path.splitRoot ./.).subpath; _internalBaseRoot = /.; _internalVersion = 2; _type = "fileset"; }'
'{ _internalBase = ./.; _internalBaseComponents = path.subpath.components (path.splitRoot ./.).subpath; _internalBaseRoot = /.; _internalIsEmptyWithoutBase = false; _internalVersion = 3; _type = "fileset"; }'
expectEqual '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 1; }' \
'{ _type = "fileset"; _internalVersion = 2; }'
'{ _type = "fileset"; _internalIsEmptyWithoutBase = false; _internalVersion = 3; }'
expectEqual '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 2; }' \
'{ _type = "fileset"; _internalIsEmptyWithoutBase = false; _internalVersion = 3; }'

# Future versions of the internal representation are unsupported
expectFailure '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 3; }' '<tests>: value is a file set created from a future version of the file set library with a different internal representation:
\s*- Internal version of the file set: 3
\s*- Internal version of the library: 2
expectFailure '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 4; }' '<tests>: value is a file set created from a future version of the file set library with a different internal representation:
\s*- Internal version of the file set: 4
\s*- Internal version of the library: 3
\s*Make sure to update your Nixpkgs to have a newer version of `lib.fileset`.'

# _create followed by _coerce should give the inputs back without any validation
expectEqual '{
inherit (_coerce "<test>" (_create ./. "directory"))
_internalVersion _internalBase _internalTree;
}' '{ _internalBase = ./.; _internalTree = "directory"; _internalVersion = 2; }'
}' '{ _internalBase = ./.; _internalTree = "directory"; _internalVersion = 3; }'

#### Resulting store path ####

Expand All @@ -311,6 +314,12 @@ tree=(
)
checkFileset './.'

# The empty value without a base should also result in an empty result
tree=(
[a]=0
)
checkFileset '_emptyWithoutBase'

# Directories recursively containing no files are not included
tree=(
[e/]=0
Expand Down Expand Up @@ -408,13 +417,30 @@ expectFailure 'toSource { root = ./.; fileset = unions [ ./. ./b ]; }' 'lib.file

# unions needs a list with at least 1 element
expectFailure 'toSource { root = ./.; fileset = unions null; }' 'lib.fileset.unions: Expected argument to be a list, but got a null.'
expectFailure 'toSource { root = ./.; fileset = unions [ ]; }' 'lib.fileset.unions: Expected argument to be a list with at least one element, but it contains no elements.'

# The tree of later arguments should not be evaluated if a former argument already includes all files
tree=()
checkFileset 'union ./. (_create ./. (abort "This should not be used!"))'
checkFileset 'unions [ ./. (_create ./. (abort "This should not be used!")) ]'

# unions doesn't include any files for an empty list or only empty values without a base
tree=(
[x]=0
[y/z]=0
)
checkFileset 'unions [ ]'
checkFileset 'unions [ _emptyWithoutBase ]'
checkFileset 'unions [ _emptyWithoutBase _emptyWithoutBase ]'
checkFileset 'union _emptyWithoutBase _emptyWithoutBase'

# The empty value without a base is the left and right identity of union
tree=(
[x]=1
[y/z]=0
)
checkFileset 'union ./x _emptyWithoutBase'
checkFileset 'union _emptyWithoutBase ./x'

# union doesn't include files that weren't specified
tree=(
[x]=1
Expand Down

0 comments on commit 4f35f00

Please sign in to comment.