error: not an absolute path: 'nix-archive-1' with --store builds #6253

Closed
kevincox opened this issue Mar 13, 2022 · 4 comments · Fixed by #8805
@kevincox
Contributor

Describe the bug

Occasionally builds fail with the error:

error: not an absolute path: 'nix-archive-1'

This appears to occur when copying paths back from the builder when using --store.

I'm not sure, but it also appears to occur most often when multiple builds are running in parallel. Maybe it has something to do with both builds waiting on one derivation, which somehow breaks the copy?

The issue is transient. I haven't seen a case where the first retry didn't fix it.

Steps To Reproduce

nix-build --store builder.example
# or
sudo nixos-rebuild boot --build-host builder.example

Expected behavior

The builds succeed reliably.

% nix-env --version
nix-env (Nix) 2.6.1
@kevincox kevincox added the bug label Mar 13, 2022
@colemickens
Member

I have recently parallelized my deployments. This often involves concurrent nix copy operations that are either copying drvs/outs and/or evaling at the same time.

After the parallelization, I'm regularly seeing "sporadic" failures, and this is one that occurs pretty frequently.

==>> nix copy --derivation /nix/store/31r8a1vg9hk4n62y7h0jrd78824i735r-nixos-system-rpizerotwo1-22.11.20220530.35f6d41.drv --eval-store auto --to ssh-ng://root@pkta64.cloud.r10e.tech --no-check-sigs
copying 15 paths...
copying path '/nix/store/15ndjvv7mshgwznjm3c361spgzz3aa51-stage-1-init.sh.drv' to 'ssh-ng://root@pkta64.cloud.r10e.tech'...
copying path '/nix/store/daykpcdggpzp2hj62zq7s350f9p5p3yr-tow-boot-update.drv' to 'ssh-ng://root@pkta64.cloud.r10e.tech'...
copying path '/nix/store/1m7d6pyr4lkrwlnvhwzf6259xbhnl038-system-path.drv' to 'ssh-ng://root@pkta64.cloud.r10e.tech'...
copying path '/nix/store/2yv9pdkzmrv5d7d86rsgw95jic4g2na0-initrd-linux-5.18.drv' to 'ssh-ng://root@pkta64.cloud.r10e.tech'...
copying path '/nix/store/abcfrlwlsbw1dpni5ni98jf264yql4w3-etc-os-release.drv' to 'ssh-ng://root@pkta64.cloud.r10e.tech'...
copying path '/nix/store/3i6ywdrkzmph9j7wjndvjaya6wy7dbdp-unit-systemd-fsck-.service.drv' to 'ssh-ng://root@pkta64.cloud.r10e.tech'...
copying path '/nix/store/7ga729wkcsx9hj8r6qnrshl2765kh6ma-unit-polkit.service.drv' to 'ssh-ng://root@pkta64.cloud.r10e.tech'...
copying path '/nix/store/mf0sj6q327kvxcm4qls1iydw66j8qchk-dbus-1.drv' to 'ssh-ng://root@pkta64.cloud.r10e.tech'...
copying path '/nix/store/8x3dj795wy4f450vzqc39yk6hnwpw0xh-unit-dbus.service.drv' to 'ssh-ng://root@pkta64.cloud.r10e.tech'...
copying path '/nix/store/hv794k5ry1c7ch7pj88wpgjbaggx6h56-system-units.drv' to 'ssh-ng://root@pkta64.cloud.r10e.tech'...
copying path '/nix/store/l64h3x5w5bmqzl40qfd92kq17x5ak71f-issue.drv' to 'ssh-ng://root@pkta64.cloud.r10e.tech'...
copying path '/nix/store/ilwn6qqfip0vqb0f2li0kjlbypxl34j3-unit-dbus.service.drv' to 'ssh-ng://root@pkta64.cloud.r10e.tech'...
copying path '/nix/store/p8gvqjvkzbggmd2fnmbr9kj98ij10gin-user-units.drv' to 'ssh-ng://root@pkta64.cloud.r10e.tech'...
copying path '/nix/store/qjsa0fjrn7zvc26makyk9272ynw15fvl-etc.drv' to 'ssh-ng://root@pkta64.cloud.r10e.tech'...
copying path '/nix/store/31r8a1vg9hk4n62y7h0jrd78824i735r-nixos-system-rpizerotwo1-22.11.20220530.35f6d41.drv' to 'ssh-ng://root@pkta64.cloud.r10e.tech'...
error: not an absolute path: 'nix-archive-1'

@fogti
Contributor

fogti commented Sep 15, 2022

This sounds strongly like a NAR parsing failure, although I don't think the same NAR should be parsed by multiple threads in parallel, or that a stream of NARs gets corrupted. So my guess is missing synchronization of NAR stream parsing/handling, leading to interleaved NARs?

@gbpdt
Contributor

gbpdt commented Sep 16, 2022

We found that #6612 fixes the issue, see #6730 (comment). I don't believe we've seen the issue since in our environment.

@abbec

abbec commented Jan 31, 2023

Correct me if I am wrong, but to me it looks like the path it takes inside LocalStore::addToStore could cause it to consider the path valid (if it was created by a parallel daemon process) and never actually consume the NAR data. This would make it error with this message when trying to parse the next ValidPathInfo here. I am going to try to drain the NAR archive and see if it fixes the issue.

This would not happen when a single copy process is active client-side, since the copy is preceded by a QueryValidPaths and only paths that are not valid are part of the data sent from the client. It does, however, reproduce 100% of the time when running two parallel nix copy processes.
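
To make the suspected failure mode concrete, here is a minimal Python sketch. It is a toy model, not Nix's actual wire format and not the code in LocalStore::addToStore: the framing helpers `write_bytes` and `read_bytes`, the functions `send_paths` and `receive_paths`, and the `drain` flag are all hypothetical names made up for illustration. The only assumption taken from the thread is that each entry on the stream is a path header followed by an archive that starts with the string nix-archive-1.

```python
import io
import struct

NAR_MAGIC = b"nix-archive-1"  # the string that shows up in the error message


def write_bytes(out, data):
    """Write a length-prefixed byte string (toy framing, an assumption)."""
    out.write(struct.pack("<Q", len(data)))
    out.write(data)


def read_bytes(inp):
    """Read one length-prefixed byte string."""
    (n,) = struct.unpack("<Q", inp.read(8))
    return inp.read(n)


def send_paths(out, paths):
    """Client side: for each path, send a header and then its toy archive."""
    for path, contents in paths:
        write_bytes(out, path.encode())  # stands in for the path info header
        write_bytes(out, NAR_MAGIC)      # start of the toy archive
        write_bytes(out, contents)       # archive payload


def receive_paths(inp, count, already_valid, drain):
    """Server side: read `count` entries and return the paths that were stored.

    If a path is already valid and `drain` is False, the archive bytes are
    left in the stream, so the next header read sees the archive magic.
    """
    stored = []
    for _ in range(count):
        path = read_bytes(inp).decode()
        if not path.startswith("/"):
            raise ValueError(f"not an absolute path: '{path}'")
        if path in already_valid:
            if drain:
                read_bytes(inp)  # consume the magic
                read_bytes(inp)  # consume the payload
            continue
        if read_bytes(inp) != NAR_MAGIC:
            raise ValueError("bad archive magic")
        read_bytes(inp)  # payload; a real store would unpack it here
        stored.append(path)
    return stored
```

In this model, sending two paths where the first is already valid on the receiving side raises `ValueError` with the message `not an absolute path: 'nix-archive-1'` when `drain=False`, and stores only the second path when `drain=True`. That matches the shape of the hypothesis above, but it says nothing about how the real daemon frames its data.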

simonrainerson added a commit to goodbyekansas/nix that referenced this issue Jan 31, 2023
When receiving a stream of NARs through the ssh-ng protocol, an already
existing path would cause the NAR archive to not be read in the stream,
resulting in trying to parse the NAR as a ValidPathInfo. This results in
the error message:
    error: not an absolute path: 'nix-archive-1'

Fixes NixOS#6253

Usually this problem is avoided by running QueryValidPaths before
AddMultipleToStore, but can arise when two parallel nix processes get
the same response from QueryValidPaths. This makes the problem more
prominent when running builds in parallel.
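
The check-then-act race described in the commit message can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ToyStore`, `plan_upload`, and `demo` are hypothetical names, the store paths are placeholders, and the two "clients" are interleaved by hand rather than run as real processes.

```python
class ToyStore:
    """A stand-in for the remote store; it only tracks which paths are valid."""

    def __init__(self):
        self.valid = set()

    def query_valid_paths(self, paths):
        """Return the subset of `paths` that is already valid."""
        return {p for p in paths if p in self.valid}

    def add_path(self, path):
        """Return True if the path was stored, False if it was already valid."""
        if path in self.valid:
            return False
        self.valid.add(path)
        return True


def plan_upload(store, wanted):
    """Client side: decide what to send based on a query made up front."""
    return sorted(set(wanted) - store.query_valid_paths(wanted))


def demo():
    store = ToyStore()
    wanted = ["/nix/store/a-example.drv", "/nix/store/b-example.drv"]

    # Both clients query before either has uploaded, so both see "nothing valid".
    plan_a = plan_upload(store, wanted)
    plan_b = plan_upload(store, wanted)

    # Client A uploads first; client B then sends paths that are now valid.
    results_a = [store.add_path(p) for p in plan_a]
    results_b = [store.add_path(p) for p in plan_b]
    return results_a, results_b
```

Here `demo()` returns `([True, True], [False, False])`: every path client B sends takes the "already valid" branch on the receiving side, which is the branch where, per the commit message, the NAR was not being read from the stream.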
simonrainerson added a commit to goodbyekansas/nix that referenced this issue Feb 14, 2023
thufschmitt pushed a commit to tweag/nix that referenced this issue Aug 7, 2023
thufschmitt pushed a commit to tweag/nix that referenced this issue Aug 7, 2023
thufschmitt pushed a commit that referenced this issue Aug 8, 2023