futex returns ERESTART_RESTARTBLOCK on Ubuntu 9.10 under VMware #420
Interesting. It certainly looks like that's not supposed to happen. I think it's a kernel bug but I'll ask around and see if I can get someone to confirm that.

Labels changed: added helpwanted, os-linux. Owner changed to r...@golang.org. Status changed to Thinking. |
A few more questions about your setup. Could you please send the output of

uname -a
cat /etc/lsb-release
cat /proc/cpuinfo

? Did you have $GOMAXPROCS set explicitly? How long does the loop typically run before the crash?

I'm running on a 64-bit Ubuntu 9.10 machine myself (real hardware, not VMware), and I've tried your loop but haven't seen a crash yet. My setup is

c2=; uname -a
Linux c2 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:05:01 UTC 2009 x86_64 GNU/Linux
c2=; cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=9.10
DISTRIB_CODENAME=karmic
DISTRIB_DESCRIPTION="Ubuntu 9.10"
c2=; cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Duo CPU E7200 @ 2.53GHz
stepping : 6
cpu MHz : 1600.000
cache size : 3072 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm
bogomips : 5067.72
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Duo CPU E7200 @ 2.53GHz
stepping : 6
cpu MHz : 1600.000
cache size : 3072 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm
bogomips : 5066.80
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

c2=;

Status changed to WaitingForReply. |
I don't think you'll be able to reproduce outside of vmware - I set up a VirtualBox image, and, while VirtualBox is a lot slower, I haven't seen it crash so far. It usually crashes every 5th run or so on vmware. I haven't set any other GO-environment variables.

Linux klimek-desktop 2.6.31-16-generic #53-Ubuntu SMP Tue Dec 8 04:02:15 UTC 2009 x86_64 GNU/Linux

DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=9.10
DISTRIB_CODENAME=karmic
DISTRIB_DESCRIPTION="Ubuntu 9.10"

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
stepping : 11
cpu MHz : 2405.500
cache size : 4096 KB
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss syscall nx lm constant_tsc arch_perfmon pebs bts rep_good tsc_reliable pni ssse3 cx16 hypervisor lahf_lm
bogomips : 4811.00
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
stepping : 11
cpu MHz : 2405.500
cache size : 4096 KB
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss syscall nx lm constant_tsc arch_perfmon pebs bts rep_good tsc_reliable pni ssse3 cx16 hypervisor lahf_lm
bogomips : 4811.00
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
stepping : 11
cpu MHz : 2405.500
cache size : 4096 KB
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss syscall nx lm constant_tsc arch_perfmon pebs bts rep_good tsc_reliable pni ssse3 cx16 hypervisor lahf_lm
bogomips : 4733.69
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
stepping : 11
cpu MHz : 2405.500
cache size : 4096 KB
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss syscall nx lm constant_tsc arch_perfmon pebs bts rep_good tsc_reliable pni ssse3 cx16 hypervisor lahf_lm
bogomips : 4810.09
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: |
I talked to some kernel hackers here, and basically this kind of thing happens because the kernel is a little too clever in the way it handles restarting system calls after signals. What the kernel does can be made correct, but bugs in other code (basically anything that mishandles TIF_SIGPENDING) can trigger this subtle problem. Does the bug show itself if you unload all the VMware-specific modules from the kernel? |
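For readers unfamiliar with the restart codes mentioned above: the Linux kernel uses a small set of internal errno values (defined in include/linux/errno.h) to mark a system call for restart after a signal, and these are never supposed to reach user space. The following is only a rough sketch of how a program might recognise them. The helper name isKernelRestartCode is made up for illustration and is not part of the Go runtime or of any fix discussed in this issue.

```go
package main

import "fmt"

// Kernel-internal restart codes, as defined in the Linux source file
// include/linux/errno.h. The names follow the kernel's spelling.
// User space should never observe these values.
const (
	ERESTARTSYS           = 512
	ERESTARTNOINTR        = 513
	ERESTARTNOHAND        = 514
	ERESTART_RESTARTBLOCK = 516
)

// isKernelRestartCode is a hypothetical helper: it reports whether errno
// is one of the kernel-internal "restart this system call" codes.
func isKernelRestartCode(errno int) bool {
	switch errno {
	case ERESTARTSYS, ERESTARTNOINTR, ERESTARTNOHAND, ERESTART_RESTARTBLOCK:
		return true
	}
	return false
}

func main() {
	// 4 is EINTR on Linux, a normal user-visible errno.
	for _, e := range []int{4, 512, 516} {
		fmt.Printf("errno %d: kernel-internal restart code = %v\n", e, isKernelRestartCode(e))
	}
}
```

Seeing 516 come back from futex, as in this report, means one of these internal codes leaked out of the kernel.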
http://groups.google.com/group/golang-nuts/browse_thread/thread/c3a554fde3486918 Another report, this one using VirtualBox. |
Issue #578 has been merged into this issue. |
Looks like libpthread avoids this problem by ignoring the return value from futex. I'm not convinced that's better. http://bit.ly/bywwwL |
Just ran into this problem on a Slicehost node, where I have a web application running with 30 goroutines. I have a stack trace here: http://hoisie.com/freshstatus.txt The system is CentOS 5.4 running a 2.6.31 kernel. I'll try to downgrade the kernel and see if it still comes up. |
This issue was closed by revision 8ba5c55. Status changed to Fixed. Merged into issue #-. |
Issue #480 has been merged into this issue. |
This issue was closed.
by manuel.klimek: