random fails/segfaults from the io-upstream merge when RUST_THREADS=2 on Mac #7797
Comments
Did not reproduce atop Linux. So this may be Mac-specific (how much is platform dependent here? I guess we are at the mercy of the kernel's thread scheduling?)
Raise ulimit on open file descriptors? We've seen that error recently
Can someone in California with a Mac attempt to reproduce this via the command line I provided in the description? I have reproduced atop Mac OS X 10.7.5 and some variant of 10.8 (I don't have that machine at hand at the moment). I just want to make sure I'm not the only one capable of reproducing the issue.
(mm, I'm a little embarrassed to note that the command I listed in the description says
I have now confirmed that the problem arises with and without
@graydon I tried to increase the max number of open files, namely by using this command Note also that when using
I just remember brian had to reboot the mac builders recently to raise the fd limit, I don't remember if that's how he did it. Ask him when he's back?
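For anyone who wants to rule out the file descriptor limit, here is a minimal sketch of inspecting and raising the soft limit from a wrapper process. It assumes a Unix-like system with Python available; the helper name raise_nofile_soft_limit is hypothetical, and this is not necessarily how the limit was changed in the comments above or on the builders.

```python
import resource


def raise_nofile_soft_limit(target):
    """Try to raise the soft limit on open file descriptors toward `target`.

    Returns the soft limit in effect afterwards. The kernel may still reject
    the request (ValueError or OSError), for example when the value exceeds a
    system-wide per-process maximum.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process cannot lift its soft limit above the hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Only call setrlimit when the request is an actual increase.
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft limit now:", raise_nofile_soft_limit(4096))
```

The new limit only applies to the calling process and the children it starts afterwards, so the test run would have to be launched from the same process for the change to matter.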
Same problem as #7772
yeah, okay, I'm willing to believe that the majority of the issues in the description are from #7772. So I'm closing as a duplicate That
Summary: there is some non-deterministic crash while doing make check that is easier for me to reproduce when I set environment variable RUST_THREADS=2 during the make check. This is on a build with both --enable-debug and --enable-optimize; I am not sure if removing those options actually makes the bug go away, or merely makes it harder to reproduce. (Update: --enable-optimize is not relevant; it can be toggled on/off and the bug persists.)
It appears to have been injected by commit 41dcec2.
This is a merge commit by bors; its parent supplied by bors is: 137d1fb
Things seem to work fine on the parent.
I found during my attempts to bisect the history that I could not trust our build infrastructure to reliably update LLVM source and/or rebuild the object code (perhaps for LLVM or perhaps elsewhere), so I got in the habit of deleting both the LLVM source tree and the entire build tree (and re-running configure) during the bisection. This of course makes the build take a bit longer. (But now that I have narrowed it to this particular merge, perhaps one can be less conservative with how much cleanup one does before testing.)
So, to reproduce, here is the command I was running in the build tree:
(There are certainly steps that can be omitted from the above, like the git log -1, which was for my own logging purposes while I bisected. remake is the fork of GNU make I often use, but that is almost certainly orthogonal.)
I wish I had better information (either a more specific commit from the originating branch, or a narrow test), but as of yet I have not been able to gather such. I'm mostly filing this so that I keep a log of the crashes and any other detail I manage to gather.
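Since the failure is non-deterministic, one passing run says little about whether a given commit is good. A rough sketch of a repeat-runner that could help during bisection, assuming Python is available; the name count_failures and the default run count are hypothetical and not part of the reproduction command from this report:

```python
import os
import subprocess
import sys


def count_failures(cmd, runs=10, threads="2"):
    """Run `cmd` `runs` times with RUST_THREADS set; return how many runs failed."""
    env = dict(os.environ)
    env["RUST_THREADS"] = threads
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd, env=env, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        # A crash from a signal (e.g. a segfault) gives a negative return code,
        # an ordinary test failure a positive one; both count as failures.
        if result.returncode != 0:
            failures += 1
    return failures


if __name__ == "__main__":
    # Example: python repeat.py make check
    print(count_failures(sys.argv[1:]), "failing runs")
```

Zero failures over N runs still only suggests a commit is good; the more rarely the crash shows up, the more runs are needed before trusting the result.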
A few of the crashes I saw occurred while testing rustpkg, and included this as the output:
shortly before crashing; from what I can tell from skimming the source code, there is no way we should be seeing that message from tests on an Intel Mac, unless some internal state has been corrupted. But the failure is non-deterministic, so this does not happen reliably.
A few samples of the sorts of crashes that one sees while running make check, some while using RUST_LOG to get more info (I have more complete logs available on request); I had to post-process some of the output, replacing raw bytes unacceptable to this text box, e.g. \342\200\224, with character strings representing those byte sequences:
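The byte-escaping step mentioned above can be sketched as follows. This is an assumption about the approach, not the script actually used for these logs, and the function name escape_non_ascii is hypothetical. It keeps printable ASCII, tabs and newlines, and writes every other byte as a three-digit octal escape, so the UTF-8 bytes of U+2014 come out as \342\200\224.

```python
def escape_non_ascii(raw: bytes) -> str:
    """Render bytes as text, replacing non-printable and non-ASCII bytes
    with backslash-octal escapes such as \\342."""
    out = []
    for b in raw:
        if 0x20 <= b < 0x7F or b in (0x09, 0x0A):
            # Printable ASCII, tab, and newline pass through unchanged.
            out.append(chr(b))
        else:
            # Everything else becomes a backslash plus three octal digits.
            out.append("\\%03o" % b)
    return "".join(out)


if __name__ == "__main__":
    print(escape_non_ascii(b"task failed \xe2\x80\x94 aborting"))
```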