OSX builders may have flaky disks/filesystems #41201

Closed
aidanhs opened this issue Apr 10, 2017 · 3 comments

Comments

@aidanhs
Member

aidanhs commented Apr 10, 2017

See #41188 (comment)

@aidanhs
Member Author

aidanhs commented Apr 10, 2017

Possibly cc #40802 if it is actually a network drive. Need to ask travis.

@aidanhs aidanhs changed the title OSX builders seem to use network partitions which may be flaky OSX builders seem to use network drives which may be flaky Apr 10, 2017
@aidanhs aidanhs changed the title OSX builders seem to use network drives which may be flaky OSX builders may have flaky disks Apr 11, 2017
@aidanhs
Member Author

aidanhs commented Apr 11, 2017

Direct link to travis log: https://travis-ci.org/rust-lang/rust/jobs/220653967

For reference, here's the log snippet that I found odd (note that before this, cache restore was terminated for taking too long - could this have caused an odd state?):

[00:00:02] +'[' '!' -d /Users/travis/rustsrc/src/.git -o 0 '!=' 0 -o 128 '!=' 0 ']'
[00:00:02] +echo 'WARNING: /Users/travis/rustsrc/cache_valid1 exists but bad repo: l:       0, ec:128'
[00:00:02] +rm -rf /Users/travis/rustsrc
[00:00:02] WARNING: /Users/travis/rustsrc/cache_valid1 exists but bad repo: l:       0, ec:128
[00:00:15] rm: /Users/travis/rustsrc/src: Directory not empty
[00:00:15] rm: /Users/travis/rustsrc: Directory not empty

In the log above, rm -rf complains that a directory isn't empty. Since rm -rf deletes a directory's contents before the directory itself, this error shouldn't be possible in an ideal world.
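
For reference, the message itself corresponds to the ENOTEMPTY error that removing a directory reports when the directory still has entries. Here is a minimal Python sketch of that behaviour. The directory and file names are made up for illustration, and some filesystems report EEXIST instead, which POSIX also allows:

```python
import errno
import os
import tempfile


def rmdir_errno(path):
    """Try to remove a directory; return None on success, else the errno."""
    try:
        os.rmdir(path)
    except OSError as exc:
        return exc.errno
    return None


with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "src")  # hypothetical stand-in for rustsrc/src
    os.mkdir(src)
    leftover = os.path.join(src, "leftover")  # hypothetical leftover entry
    open(leftover, "w").close()

    # A directory that still has an entry cannot be removed.
    nonempty_result = rmdir_errno(src)
    print(os.strerror(nonempty_result))

    # Once the entry is gone, the same call succeeds.
    os.unlink(leftover)
    empty_result = rmdir_errno(src)
```

So for rm -rf to print this, something must still have been in the directory at the moment it tried to remove it.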

Initially I suspected a bad disk and/or a network drive (since that's what the internet mostly suggests), but my testing of the travis environment found no evidence of autofs actually connecting over the network for /home (it'd be surprising if it did, admittedly). That's not conclusive; it may just be that I don't understand OSX well enough (are /home and /Users even related?). I then tested a theory that OSX gives a different error message than Linux when it fails to delete a folder due to lack of permission to delete the things inside, but that didn't pan out. So I'm back to a bad disk.

I've sent travis support an e-mail.

@aidanhs aidanhs changed the title OSX builders may have flaky disks OSX builders may have flaky disks/filesystems Apr 11, 2017
@aidanhs
Member Author

aidanhs commented Apr 20, 2017

A possible explanation is that a file inside was owned by root and couldn't be deleted, which then made directory removal fail (note the cache restore failure, so indeed these files were owned by root).

This is a bit odd though, because usually you'd expect a log line saying "Permission denied" in that scenario, like https://travis-ci.org/aidanhs/rust-appveyor/jobs/222240678#L74-L86. But travis points out that FS corruption is unlikely (and I agree), so let's close this for now, since the git cache is being reworked.
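
To illustrate why a "Permission denied" line would normally be expected, here is a rough Python sketch of a recursive delete that keeps going after errors, the way rm -rf does. It is not the actual rm implementation: `remove_tree` and `make_failing_unlink` are hypothetical helpers, and the undeletable file is simulated by injecting a failing unlink rather than by using real root ownership.

```python
import errno
import os
import tempfile


def remove_tree(path, unlink=os.unlink):
    """Remove children first, then the directory, collecting errors.

    Returns a list of (path, errno) for every removal that failed.
    `unlink` is injectable so a failure can be simulated without root.
    """
    errors = []
    for name in sorted(os.listdir(path)):
        child = os.path.join(path, name)
        if os.path.isdir(child) and not os.path.islink(child):
            errors.extend(remove_tree(child, unlink))
        else:
            try:
                unlink(child)
            except OSError as exc:
                errors.append((child, exc.errno))
    try:
        os.rmdir(path)
    except OSError as exc:
        errors.append((path, exc.errno))
    return errors


def make_failing_unlink(protected_name):
    """Build an unlink that refuses one file name (simulated failure)."""
    def unlink(path):
        if os.path.basename(path) == protected_name:
            raise PermissionError(errno.EACCES, os.strerror(errno.EACCES), path)
        os.unlink(path)
    return unlink


with tempfile.TemporaryDirectory() as root:
    top = os.path.join(root, "rustsrc")
    os.makedirs(os.path.join(top, "src"))
    for name in ("ok.txt", "undeletable.txt"):  # hypothetical file names
        open(os.path.join(top, "src", name), "w").close()

    errors = remove_tree(top, make_failing_unlink("undeletable.txt"))
    for failed_path, code in errors:
        print(f"rm: {os.path.relpath(failed_path, root)}: {os.strerror(code)}")
```

In this sketch the failed file is reported first, followed by one "not empty" error for each parent directory. The travis log only shows the last two kinds of line, which is the odd part.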

@aidanhs aidanhs closed this as completed Apr 20, 2017