macOS: unexpected EOF reading a line #3605

Open
roberth opened this issue May 21, 2020 · 45 comments
Labels
bug build-problem Nix fails to compile or test; also improvements to build process macos Nix on macOS, aka OS X, aka darwin

Comments

@roberth
Member

roberth commented May 21, 2020

Describe the bug

Nix has failed to build on darwin with this log.
https://hydra.nixos.org/build/119296096/nixlog/1

It did succeed afterwards, so unless someone recognizes what's the problem in the log, it's ok to close this and move on.

The failure happens in test case tests/user-envs.sh, with the last couple of lines being:

    29 store paths deleted, 0.01 MiB freed
    + test -e /private/tmp/nix-build-nix-2.3.4.drv-0/nix-test/store/8fid272lbh1milx38cni6wi6dx7v8z7b-foo-1.0
    + '[' -e /private/tmp/nix-build-nix-2.3.4.drv-0/nix-test/store/kzm50sz0sp6gsvav10kx8xasppw0yrql-foo-2.0 ']'
    + nix-env -e '*'
    uninstalling 'foo-1.0'
    uninstalling 'bar-0.1'
    building '/private/tmp/nix-build-nix-2.3.4.drv-0/nix-test/store/prvd5da3s52k2gr2pvx510wvn7shyjqh-user-environment.drv'...
    created 0 symlinks in user environment
    ++ nix-env -q '*'
    ++ wc -l
    + test 0 -eq 0
    + nix-env -i foo
    installing 'foo-2.0'
    these derivations will be built:
      /private/tmp/nix-build-nix-2.3.4.drv-0/nix-test/store/wsrlj1hjzncx3g2i1jp65lbb1v3py2wp-foo-2.0.drv
    building '/private/tmp/nix-build-nix-2.3.4.drv-0/nix-test/store/wsrlj1hjzncx3g2i1jp65lbb1v3py2wp-foo-2.0.drv'...
    error: unexpected EOF reading a line

Steps To Reproduce

This may be a race condition or other intermittent bug.

The entire log of the build failure is available here:
https://hydra.nixos.org/build/119296096/nixlog/1

Expected behavior

Nix just builds on darwin without a test failure.

nix-env --version output

N/A

Additional context

@roberth roberth added the bug label May 21, 2020
@edolstra
Member

This test has been randomly failing for a while.

@domenkozar
Member

domenkozar commented Jun 19, 2020

In case it's relevant, it also fails on bar:

    + nix-env -i bar-0.1
    installing 'bar-0.1'
    these derivations will be built:
      /private/tmp/nix-build-nix-2.3.6.drv-0/nix-test/store/qk9gcc2say7156yhmdzdiwrybm5gyi0s-bar-0.1.drv
    building '/private/tmp/nix-build-nix-2.3.6.drv-0/nix-test/store/qk9gcc2say7156yhmdzdiwrybm5gyi0s-bar-0.1.drv'...
    error: unexpected EOF reading a line

@domenkozar
Member

#3137 looks quite relevant, there might be a bug in Nix.

@domenkozar
Member

Possibly relevant: #1704

@domenkozar domenkozar added the macos Nix on macOS, aka OS X, aka darwin label Jun 19, 2020
@abathur
Member

abathur commented Aug 16, 2020

I'm not sure it's the same issue since the context is different, but I've run into some flaky instances of this message. Let me know if you think I should open this separately?

I took a little time this week to pick at writing a plain bash script to set up Nix on travis-CI in a way that is hopefully more maintainable than the Ruby language integration, but as soon as I had the install working correctly I started noticing flaky failures with this message:

installing 'nix-2.3.7'
error: unexpected EOF reading a line

---- oh no! --------------------------------------------------------------------

Some notes:

  • It doesn't always fail
  • It's not always the same job that fails
  • I have yet to observe the linux build fail.
  • I have yet to observe the macOS 10.13 build fail (I have seen every macOS config 10.14+ fail at least once).

Edit: I've forced a rebuild on 10.13 every time I could remember to over the past 5 days, and I've finally strung together about 60 builds without a single 10.13 failure. Given how common the 10.14 and 10.15 failures have been, I'm fairly confident at this point (again, assuming this is the same issue as the one observed here) that it doesn't manifest on 10.13.

I haven't quite decided yet if I should still see it as a blocker for my purpose or not 😬

Here are links to specific runs:

  • 4 runs that each had a single (generally different) job fail: 16, 17, 18, 22
  • 2 that ran all clean: 19, 20

@abathur
Member

abathur commented Aug 23, 2020

I saw a fresh probable instance of this on CI today:

    installing 'nix-2.3.7'
    building '/nix/store/hqqr5jaxy2kaxij5jzz3kd8gd63qfdkm-user-environment.drv'...
    created 7 symlinks in user environment
    installing 'nss-cacert-3.49.2'
    building '/nix/store/055gdwqrwg839skhjcp65r2inyzdxs1p-user-environment.drv'...
    created 9 symlinks in user environment
    unpacking channels...
    error: unexpected EOF reading a line
    error: program '/nix/store/5ira7xgs92inqz1x8l0n1wci4r79hnd0-nix-2.3.7/bin/nix-env' failed with exit code 1
    Alright! We're done!
    But fetching the nixpkgs channel failed. (Are you offline?)
    To try again later, run "sudo -i nix-channel --update nixpkgs".

    Before Nix will work in your existing shells, you'll need to close
    them and open them again. Other than that, you should be ready to go.

@abathur
Member

abathur commented Aug 30, 2020

I asked @domenkozar about this on IRC and he confirmed that he has not seen this on github-actions macOS runners running either 10.14 or 10.15.

@abathur
Member

abathur commented Sep 11, 2020

I have been running Nix builds on a 2013 MBA running macOS 10.14 as I poke at Big Sur installer updates, and I've been seeing spotty EOF errors (on the scale of 0-4 per distinct attempt to build). Most of them have been in tests/user-envs.sh, but I've also seen tests/remote-store.sh come up at least once.

Edit: It seems like these have been growing more common as I kept building. I've attached a log of a loop I did where it tried and failed more than 10 times. At some point it got stuck, so I killed it. After a reboot it built fine.
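
For anyone who wants to measure how often the error shows up, a retry loop like the one described can be sketched as below. This is only an illustration, not the loop that produced the attached log: the function name `count_eof_failures` and the choice to fold stderr into stdout are assumptions.

```python
import subprocess

# The error text quoted throughout this issue.
EOF_MARKER = "unexpected EOF reading a line"


def count_eof_failures(argv, runs):
    """Run argv `runs` times and count failed runs whose output has the marker."""
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold stderr into stdout: one search covers both
            text=True,
        )
        # Count only runs that failed *and* printed the EOF message, so that
        # unrelated build failures are not attributed to this bug.
        if result.returncode != 0 and EOF_MARKER in result.stdout:
            failures += 1
    return failures
```

Pass the build or test command as a list of arguments; the return value is the number of runs that hit the EOF error.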

@domenkozar
Member

domenkozar commented Feb 1, 2021

The logging test is now also failing quite often with:

    building '/private/tmp/nix-build-nix-2.4pre19700101_d0b74e2.drv-0/nix-test/logging/store/k6n0xmmfjbwn35jm7g5d82lvdgzx0vgb-dependencies-input-1.drv'...
    building '/private/tmp/nix-build-nix-2.4pre19700101_d0b74e2.drv-0/nix-test/logging/store/q5x298m0dagw5njwpvp0838hvkprfrkm-dependencies-input-0.drv'...
    error: unexpected EOF reading a line

@domenkozar
Member

There are now debugging prints in Nix master that show there is no content where Nix expects it, which is why it fails.

I'd like to offer $100 from https://opencollective.com/nix-macos fund to anyone that fixes this.

@domenkozar
Member

Offering $150 now to whoever fixes it.

@domenkozar
Member

Note that this also happens in the wild:

installing 'nss-cacert-3.49.2'
error: unexpected EOF reading a line

The existing debugging output doesn't yield much insight. I wonder what other information we could extract to see why setting up the build environment fails.

@roberth
Member Author

roberth commented Jun 28, 2021

  • Maybe wait for the child process exit code when its output is empty, so we can log it?
  • Maybe the problem is in sandbox-exec. I don't know how bad sandbox-exec logging is. I know launchctl likes to just exit(3) and print nothing... Maybe we need to keep the sandbox definition file around for manual inspection?
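
The first suggestion above can be sketched roughly as follows. This is a Python illustration of the idea only, not Nix's C++ code: the function name `read_first_line` and the use of a plain pipe instead of a pseudoterminal are assumptions made to keep the example short.

```python
import subprocess


def read_first_line(argv):
    """Return the child's first output line, or raise with its exit status on EOF."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    try:
        line = proc.stdout.readline()
        if not line:
            # EOF before any output: reap the child so the error can say how it
            # ended. A negative status means the child was killed by that signal.
            status = proc.wait()
            raise RuntimeError(
                f"unexpected EOF reading a line (child exit status {status})"
            )
        return line
    finally:
        # Always release the pipe and reap the child, on success or failure.
        proc.stdout.close()
        proc.wait()
```

With this shape, an empty read no longer produces a bare "unexpected EOF"; the message also carries the child's exit status, which is the extra information being asked for.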

@domenkozar
Member

It's not using sandbox in the tests, so the builder is actually just the derivation?

@domenkozar
Member

Using -vvv:

    building of '/private/tmp/nix-build-nix-2.4pre19700101_9066e26.drv-0/nix-test/tests/remote-store/store/6yc0jmq89cg99cv9h81a9xa8lpmnlhi9-foo-1.0.drv!*' from .drv file: woken up
    executing builder '/nix/store/c9cxq1583a85bsq76q2rbnbiwwp7ygxr-bash-4.4-p23/bin/bash'
    killing process 16455
    lock released on '/private/tmp/nix-build-nix-2.4pre19700101_9066e26.drv-0/nix-test/tests/remote-store/store/1b4y7qh4iljb6p8n8vgqyq6wf33phzms-foo-1.0.lock'
    building of '/private/tmp/nix-build-nix-2.4pre19700101_9066e26.drv-0/nix-test/tests/remote-store/store/6yc0jmq89cg99cv9h81a9xa8lpmnlhi9-foo-1.0.drv!*' from .drv file: goal destroyed
    6 operations
    error: writing to file: Bad file descriptor

           … while setting up the build environment

@domenkozar
Member

That seems somewhat related, but here is the -vvv log of the original issue:

    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/j3d5lfl6gs5iz9j2pw47bjn5i3dhyg2s-foo-1.0.drv!*' from .drv file: woken up
    executing builder '/nix/store/c9cxq1583a85bsq76q2rbnbiwwp7ygxr-bash-4.4-p23/bin/bash'
    sandbox setup: Generated sandbox profile:
    sandbox setup: (version 1)
    sandbox setup: (import "sandbox-minimal.sb")
    sandbox setup:
    building '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/j3d5lfl6gs5iz9j2pw47bjn5i3dhyg2s-foo-1.0.drv'...
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/j3d5lfl6gs5iz9j2pw47bjn5i3dhyg2s-foo-1.0.drv!*' from .drv file: got EOF
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/j3d5lfl6gs5iz9j2pw47bjn5i3dhyg2s-foo-1.0.drv!*' from .drv file: woken up
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/j3d5lfl6gs5iz9j2pw47bjn5i3dhyg2s-foo-1.0.drv!*' from .drv file: build done
    killing process 39293
    builder process for '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/j3d5lfl6gs5iz9j2pw47bjn5i3dhyg2s-foo-1.0.drv' finished
    scanning for references for output 'out' in temp location '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/6gs8gaswwz2cbdylk6qlam945kckqgpm-foo-1.0'
    unreferenced input: '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/7mgzdii2ijr2qwqzx8lz2yl9aa6r0rs0-user-envs.builder.sh'
    lock released on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/6gs8gaswwz2cbdylk6qlam945kckqgpm-foo-1.0.lock'
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/j3d5lfl6gs5iz9j2pw47bjn5i3dhyg2s-foo-1.0.drv!*' from .drv file: done
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/j3d5lfl6gs5iz9j2pw47bjn5i3dhyg2s-foo-1.0.drv!*' from .drv file: goal destroyed
    acquiring write lock on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/var/nix/temproots/39287'
    downgrading to read lock on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/var/nix/temproots/39287'
    acquiring write lock on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/var/nix/temproots/39287'
    downgrading to read lock on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/var/nix/temproots/39287'
    locking path '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/94vhkasim5z09rqji819kj14cdgdh66s-env-manifest.nix'
    lock acquired on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/94vhkasim5z09rqji819kj14cdgdh66s-env-manifest.nix.lock'
    lock released on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/94vhkasim5z09rqji819kj14cdgdh66s-env-manifest.nix.lock'
    evaluating user environment builder
    acquiring write lock on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/var/nix/temproots/39287'
    downgrading to read lock on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/var/nix/temproots/39287'
    locking path '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/v1v9cifi74510l77da78sljsnd0wkinr-user-environment.drv'
    lock acquired on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/v1v9cifi74510l77da78sljsnd0wkinr-user-environment.drv.lock'
    lock released on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/v1v9cifi74510l77da78sljsnd0wkinr-user-environment.drv.lock'
    instantiated 'user-environment' -> '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/v1v9cifi74510l77da78sljsnd0wkinr-user-environment.drv'
    building user environment
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/v1v9cifi74510l77da78sljsnd0wkinr-user-environment.drv!*' from .drv file: created
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/v1v9cifi74510l77da78sljsnd0wkinr-user-environment.drv!*' from .drv file: woken up
    querying info about missing paths...
    starting pool of 2 threads
    entered goal loop
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/v1v9cifi74510l77da78sljsnd0wkinr-user-environment.drv!*' from .drv file: init
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/v1v9cifi74510l77da78sljsnd0wkinr-user-environment.drv!*' from .drv file: loading derivation
    acquiring write lock on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/var/nix/temproots/39287'
    downgrading to read lock on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/var/nix/temproots/39287'
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/v1v9cifi74510l77da78sljsnd0wkinr-user-environment.drv!*' from .drv file: have derivation
    acquiring write lock on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/var/nix/temproots/39287'
    downgrading to read lock on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/var/nix/temproots/39287'
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/v1v9cifi74510l77da78sljsnd0wkinr-user-environment.drv!*' from .drv file: all outputs substituted (maybe)
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/v1v9cifi74510l77da78sljsnd0wkinr-user-environment.drv!*' from .drv file: all inputs realised
    added input paths '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/6gs8gaswwz2cbdylk6qlam945kckqgpm-foo-1.0', '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/94vhkasim5z09rqji819kj14cdgdh66s-env-manifest.nix'
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/v1v9cifi74510l77da78sljsnd0wkinr-user-environment.drv!*' from .drv file: woken up
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/v1v9cifi74510l77da78sljsnd0wkinr-user-environment.drv!*' from .drv file: trying to build
    locking path '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/8jf772nvspz8p7xmrbzqapnhr1dzpxkc-user-environment'
    lock acquired on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/8jf772nvspz8p7xmrbzqapnhr1dzpxkc-user-environment.lock'
    removing invalid path '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/8jf772nvspz8p7xmrbzqapnhr1dzpxkc-user-environment'
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/v1v9cifi74510l77da78sljsnd0wkinr-user-environment.drv!*' from .drv file: woken up
    executing builder 'builtin:buildenv'
    killing process 39303
    lock released on '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/8jf772nvspz8p7xmrbzqapnhr1dzpxkc-user-environment.lock'
    building of '/private/tmp/nix-build-nix-tests.drv-0/nix-test/tests/user-envs/store/v1v9cifi74510l77da78sljsnd0wkinr-user-environment.drv!*' from .drv file: goal destroyed
    error: unexpected EOF reading a line

It seems like the buildenv builder sometimes has issues.

@domenkozar
Member

domenkozar commented Jun 30, 2021

We have a heisenbug: as soon as I add debugging statements to buildenv, the issue disappears.

@domenkozar
Member

Opened #4965

@domenkozar
Member

cc @lheckemann as you helped debug previous EOF :)

@stale stale bot removed the stale label Dec 20, 2022
@DrewMcArthur

I think I was getting a similar error when running nix-shell --verbose (truncated output in this gist), but trying again with nix-shell -vvv worked. Weird! (macOS 12.6.2)

@Et7f3
Contributor

Et7f3 commented Feb 18, 2023

I checked the various tests on master and ran them a bunch of times with no failures. I even checked out this PR #3605 (comment) and I can't seem to reproduce it. I have also removed the snippet that retries on failure. The next time someone sees a failure, can you provide: how you got your Nix (checksum of the channel or of the flake), whether you have the sandbox enabled (I have the sandbox enabled), and which command you used. If you have a repro somewhere, snapshot it in a branch and don't touch it.

@Et7f3
Contributor

Et7f3 commented Feb 18, 2023

The logging test is now also failing quite often with:

    building '/private/tmp/nix-build-nix-2.4pre19700101_d0b74e2.drv-0/nix-test/logging/store/k6n0xmmfjbwn35jm7g5d82lvdgzx0vgb-dependencies-input-1.drv'...
    building '/private/tmp/nix-build-nix-2.4pre19700101_d0b74e2.drv-0/nix-test/logging/store/q5x298m0dagw5njwpvp0838hvkprfrkm-dependencies-input-0.drv'...
    error: unexpected EOF reading a line

Can't test:

configure flags: --prefix=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2 --bindir=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2/bin --sbindir=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2/sbin --includedir=/nix/store/vs7za4as26jr2hlldgncl894l25apa6k-nix-2.4pre20210201_d0b74e2-dev/include --oldincludedir=/nix/store/vs7za4as26jr2hlldgncl894l25apa6k-nix-2.4pre20210201_d0b74e2-dev/include --mandir=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2/share/man --infodir=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2/share/info --docdir=/nix/store/43x3ppirx5w2ivs0xn2h9p46ldrrxf20-nix-2.4pre20210201_d0b74e2-doc/share/doc/nix --libdir=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2/lib --libexecdir=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2/libexec --localedir=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2/share/locale --sysconfdir=/etc
checking for a sed that does not truncate output... /nix/store/9z7rc09qr2j09smg1jyjbck7j393k668-gnused-4.8/bin/sed
checking build system type... x86_64-apple-darwin22.3.0
checking host system type... x86_64-apple-darwin22.3.0
checking for the canonical Nix system name... x86_64-darwin
checking for gcc... clang
checking whether the C compiler works... no
configure: error: in `/private/tmp/nix-build-nix-2.4pre20210201_d0b74e2.drv-0/source':
configure: error: C compiler cannot create executables
See `config.log' for more details
error: builder for '/nix/store/8iw5chn5yzxy2cnrhiysjq7x3s8r1s64-nix-2.4pre20210201_d0b74e2.drv' failed with exit code 77;
       last 10 log lines:
       > configure flags: --prefix=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2 --bindir=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2/bin --sbindir=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2/sbin --includedir=/nix/store/vs7za4as26jr2hlldgncl894l25apa6k-nix-2.4pre20210201_d0b74e2-dev/include --oldincludedir=/nix/store/vs7za4as26jr2hlldgncl894l25apa6k-nix-2.4pre20210201_d0b74e2-dev/include --mandir=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2/share/man --infodir=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2/share/info --docdir=/nix/store/43x3ppirx5w2ivs0xn2h9p46ldrrxf20-nix-2.4pre20210201_d0b74e2-doc/share/doc/nix --libdir=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2/lib --libexecdir=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2/libexec --localedir=/nix/store/c66q47v26iidd41blmif331590hhsh8f-nix-2.4pre20210201_d0b74e2/share/locale --sysconfdir=/etc
       > checking for a sed that does not truncate output... /nix/store/9z7rc09qr2j09smg1jyjbck7j393k668-gnused-4.8/bin/sed
       > checking build system type... x86_64-apple-darwin22.3.0
       > checking host system type... x86_64-apple-darwin22.3.0
       > checking for the canonical Nix system name... x86_64-darwin
       > checking for gcc... clang
       > checking whether the C compiler works... no
       > configure: error: in `/private/tmp/nix-build-nix-2.4pre20210201_d0b74e2.drv-0/source':
       > configure: error: C compiler cannot create executables
       > See `config.log' for more details
       For full logs, run 'nix log /nix/store/8iw5chn5yzxy2cnrhiysjq7x3s8r1s64-nix-2.4pre20210201_d0b74e2.drv'.

@fricklerhandwerk fricklerhandwerk added the build-problem Nix fails to compile or test; also improvements to build process label Mar 13, 2023
@fricklerhandwerk
Contributor

Possibly related: #7242

@fricklerhandwerk fricklerhandwerk moved this to 👀 In review in Nix team Mar 13, 2023
@fricklerhandwerk fricklerhandwerk moved this from 👀 In review to ⏰ Postponed in Nix team Mar 13, 2023
@Ericson2314 Ericson2314 removed this from Nix team Mar 13, 2023
@fricklerhandwerk
Contributor

Discussed in the Nix team meeting:

  • @roberth: one possible location could be in the builder setup
  • @roberth: should retry in the main code, because we can do it more aggressively
    • it makes no sense to retry the test 100 times, but we may do so in the main loop
  • @Ericson2314: it would be good to confirm the retry in CI
  • @Ericson2314: may want to talk to the foundation board about collecting and handing out bounties in a more organised fashion
  • it's important, @edolstra will look into it
  • The team overall is not actively pursuing this now, but we are not marking it "postponed" because others should feel free to work on it.
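
The "retry in the main code" idea from these notes could look roughly like the sketch below. It is not an implementation from Nix: the exception class `UnexpectedEOF`, the function name `start_with_retries`, and the default of three attempts are all hypothetical.

```python
class UnexpectedEOF(Exception):
    """Hypothetical error type standing in for 'unexpected EOF reading a line'."""


def start_with_retries(start_builder, attempts=3):
    """Call start_builder until it succeeds or has hit `attempts` EOF failures."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return start_builder()
        except UnexpectedEOF as error:
            # Only the flaky EOF is retried; any other error propagates at once.
            last_error = error
    raise last_error
```

Retrying at this level repeats only the builder startup rather than a whole test, which is why the notes suggest it can be done more aggressively than retrying in the test suite.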

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-03-13-nix-team-meeting-minutes-40/26309/1

edolstra added a commit to edolstra/nix that referenced this issue Mar 15, 2023
Hopefully this fixes "unexpected EOF" failures on macOS (NixOS#3137, NixOS#3605,

The problem appears to be that under some circumstances, macOS
discards the output written to the slave side of the
pseudoterminal. Hence the parent never sees the "sandbox initialized"
message from the child, even though it succeeded. The conditions are:

* The child finishes very quickly. That's why this bug is likely to
  trigger in nix-env tests, since that uses a builtin builder. Adding
  a short sleep before the child exits makes the problem go away.

* The parent has closed its duplicate of the slave file
  descriptor. This shouldn't matter, since the child has a duplicate
  as well, but it does. E.g. moving the close to the bottom of
  startBuilder() makes the problem go away. However, that's not a
  solution because it would make Nix hang if the child dies before
  sending the "sandbox initialized" message.

* The system is under high load. E.g. "make installcheck -j16" makes
  the issue pretty reproducible, while it's very rare under "make
  installcheck -j1".

As a fix/workaround, we now open the pseudoterminal slave in the
child, rather than the parent. This removes the second condition
(i.e. the parent no longer needs to close the slave fd) and I haven't
been able to reproduce the "unexpected EOF" with this.
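
The workaround described in this commit message can be illustrated with a small Python sketch. It is not the Nix change itself (that is C++ in startBuilder()): the function name `read_child_handshake` is hypothetical, and calling libc through ctypes is an assumption made so the sketch does not depend on a recent Python version. The point it shows is that the parent holds only the master side and the slave's path name, while the child opens the slave itself.

```python
import ctypes
import os

# libc's pty helpers, reached through ctypes; newer Python versions expose
# equivalents in the os module, but this sketch does not rely on them.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.ptsname.restype = ctypes.c_char_p


def read_child_handshake(message=b"sandbox initialized\n"):
    """Fork a child that opens the pty slave itself; return what the parent reads."""
    master = _libc.posix_openpt(os.O_RDWR | os.O_NOCTTY)
    if master < 0:
        raise OSError(ctypes.get_errno(), "posix_openpt failed")
    if _libc.grantpt(master) != 0 or _libc.unlockpt(master) != 0:
        err = ctypes.get_errno()
        os.close(master)
        raise OSError(err, "grantpt/unlockpt failed")
    slave_name = _libc.ptsname(master)  # a path only; the parent never opens it
    if slave_name is None:
        os.close(master)
        raise OSError("ptsname failed")

    pid = os.fork()
    if pid == 0:
        # Child: open the slave here, so the parent has no slave fd to close.
        status = 1
        try:
            os.close(master)
            slave = os.open(slave_name, os.O_RDWR | os.O_NOCTTY)
            os.write(slave, message)
            os.close(slave)
            status = 0
        finally:
            os._exit(status)

    # Parent: read from the master until a full line arrives or the slave closes.
    data = b""
    try:
        while not data.endswith(b"\n"):
            chunk = os.read(master, 1024)
            if not chunk:
                break
            data += chunk
    except OSError:
        pass  # Linux reports EIO instead of EOF once the slave side is closed
    finally:
        os.close(master)
        os.waitpid(pid, 0)
    return data
```

The terminal layer may turn the trailing "\n" into "\r\n", so callers should strip the result before comparing it.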
edolstra added a commit to edolstra/nix that referenced this issue Mar 15, 2023
Hopefully this fixes "unexpected EOF" failures on macOS
(NixOS#3137, NixOS#3605, NixOS#7242, NixOS#7702).

@benwaffle

benwaffle commented Mar 21, 2023

Also seeing this on an M1 Mac when trying to use a flake for nix develop:

error:
       … while reading the response from the build hook

       error: unexpected EOF reading a line

in any version (2.11, 2.12, 2.13, 2.14)

Happy to help test/debug if someone walks me through it; I'm not sure whether I can reproduce it.

@benwaffle

Update: killing nix-daemon fixed it.

sudo pkill nix-daemon

@domenkozar
Member

Hopefully fixed in #8049

@Ericson2314
Member

Closing for now, if it comes back we can reopen it.

@roberth
Member Author

roberth commented Oct 23, 2023

Another one, from this CI run

  • After an I/O error in sqlite
nix-tests>     this derivation will be built:
nix-tests>       /private/tmp/nix-build-nix-tests-2.16.3-against-2.16.3.drv-0/nix-test/tests/flakes/config/store/rxzg26cdi7vynwf7g7dkxzf5wzawcp2v-simple.drv
nix-tests>     error: executing SQLite statement 'pragma synchronous = off': disk I/O error, disk I/O error (in '/private/tmp/nix-build-nix-tests-2.16.3-against-2.16.3.drv-0/nix-test/tests/flakes/config/var/nix/db/db.sqlite')
nix-tests>     error:
nix-tests>            … while reading the response from the build hook
nix-tests>            error: unexpected EOF reading a line
nix-tests>     ++(config.sh:40) onError
nix-tests>     ++(/private/tmp/nix-build-nix-tests-2.16.3-against-2.16.3.drv-0/source/tests/common/vars-and-functions.sh:237) set +x
nix-tests>     config.sh: test failed at:
nix-tests>       main in config.sh:40

Ericson2314 pushed a commit to Ericson2314/nix that referenced this issue Oct 31, 2023
Hopefully this fixes "unexpected EOF" failures on macOS
(NixOS#3137, NixOS#3605, NixOS#7242, NixOS#7702).


(cherry picked from commit c536e00)
@athre0z
Member

athre0z commented Nov 23, 2023

I also ran into this on Darwin after updating Nix to 2.18.X (from 2.11.x). A reboot fixed the issue. Suspect that the daemon was simply incompatible with the client.
