futex returns ERESTART_RESTARTBLOCK on Ubuntu 9.10 under VMware #420

Closed
gopherbot opened this issue Dec 13, 2009 · 13 comments

@gopherbot
Contributor

by manuel.klimek:

Before filing a bug, please check whether it has been fixed since
the latest release: run "hg pull -u" and retry what you did to
reproduce the problem.  Thanks.

What steps will reproduce the problem?
1. build go on kubuntu 9.10 64 bit running on vmware on top of Windows 7
2. go/src/pkg/sync$ while ./6.out ; do true; done

What is the expected output? What do you see instead?
See the attached output.

What is your $GOOS?  $GOARCH?
$ echo $GOOS
linux
$ echo $GOARCH
amd64

Which revision are you using?  (hg identify)
$ hg identify
8f27dd511198 tip

Please provide any additional information below.
The futex kernel call (http://www.google.com/codesearch/p?hl=en#aWnupsccpa8/trunk/linux-2.6.31/kernel/futex.c&q=file:futex.c%202.6.31&sa=N&cd=1&ct=rc&l=1802)
returns ERESTART_RESTARTBLOCK.
According to http://www.google.com/codesearch/p?hl=en#aWnupsccpa8/trunk/linux-2.6.31/include/linux/errno.h&q=file:futex.c%202.6.31&d=2 this should never
be encountered by userspace. I didn't understand enough of what's going on
to be able to decide whether that's a kernel or go bug, so I thought I'd
let you decide...

Attachments:

  1. go-segfault-futex (1055 bytes)
@rsc
Contributor

rsc commented Dec 14, 2009

Comment 1:

Interesting.  It certainly looks like that's not 
supposed to happen.  I think it's a kernel bug
but I'll ask around and see if I can get someone
to confirm that.

Labels changed: added helpwanted, os-linux.

Owner changed to r...@golang.org.

Status changed to Thinking.

@rsc
Contributor

rsc commented Dec 14, 2009

Comment 3:

A few more questions about your setup.
Could you please send the output of
uname -a
cat /etc/lsb-release
cat /proc/cpuinfo
?
Did you have $GOMAXPROCS set explicitly?
How long does the loop typically run before the crash?
I'm running on a 64-bit Ubuntu 9.10 machine myself
(real hardware, not VMware), and I've tried your loop 
but haven't seen a crash yet.  My setup is
c2=; uname -a
Linux c2 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:05:01 UTC 2009 x86_64 GNU/Linux
c2=; cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=9.10
DISTRIB_CODENAME=karmic
DISTRIB_DESCRIPTION="Ubuntu 9.10"
c2=; cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Intel(R) Core(TM)2 Duo CPU     E7200  @ 2.53GHz
stepping    : 6
cpu MHz     : 1600.000
cache size  : 3072 KB
physical id : 0
siblings    : 2
core id     : 0
cpu cores   : 2
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 10
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm
bogomips    : 5067.72
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Intel(R) Core(TM)2 Duo CPU     E7200  @ 2.53GHz
stepping    : 6
cpu MHz     : 1600.000
cache size  : 3072 KB
physical id : 0
siblings    : 2
core id     : 1
cpu cores   : 2
apicid      : 1
initial apicid  : 1
fpu     : yes
fpu_exception   : yes
cpuid level : 10
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm
bogomips    : 5066.80
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
c2=;

Status changed to WaitingForReply.

@rsc
Contributor

rsc commented Dec 14, 2009

Comment 4:

I ran your loop for 11 hours on my machine and could not reproduce this.
(I'm still looking forward to seeing more details about your machine.)
Thanks.

@r4nt

r4nt commented Dec 14, 2009

Comment 5:

I don't think you'll be able to reproduce outside of vmware - I set up a VirtualBox 
image, and, while VirtualBox is a lot slower, I haven't seen it crash so far.
It usually crashes every 5th run or so on vmware.
I haven't set any other GO-environment variables.
Linux klimek-desktop 2.6.31-16-generic #53-Ubuntu SMP Tue Dec 8 04:02:15 UTC 2009 x86_64 GNU/Linux
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=9.10
DISTRIB_CODENAME=karmic
DISTRIB_DESCRIPTION="Ubuntu 9.10"
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz
stepping        : 11
cpu MHz         : 2405.500
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss syscall nx lm constant_tsc arch_perfmon pebs bts rep_good tsc_reliable pni ssse3 cx16 hypervisor lahf_lm
bogomips        : 4811.00
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz
stepping        : 11
cpu MHz         : 2405.500
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss syscall nx lm constant_tsc arch_perfmon pebs bts rep_good tsc_reliable pni ssse3 cx16 hypervisor lahf_lm
bogomips        : 4811.00
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz
stepping        : 11
cpu MHz         : 2405.500
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss syscall nx lm constant_tsc arch_perfmon pebs bts rep_good tsc_reliable pni ssse3 cx16 hypervisor lahf_lm
bogomips        : 4733.69
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz
stepping        : 11
cpu MHz         : 2405.500
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss syscall nx lm constant_tsc arch_perfmon pebs bts rep_good tsc_reliable pni ssse3 cx16 hypervisor lahf_lm
bogomips        : 4810.09
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

@rsc
Contributor

rsc commented Dec 16, 2009

Comment 6:

I talked to some kernel hackers here, and basically this kind of
thing happens because the kernel is a little too clever in the
way it handles restarting system calls after signals.  What the
kernel does can be made correct, but bugs in other code 
(basically anything that mishandles TIF_SIGPENDING)
can trigger this subtle problem.  Does the bug show itself
if you unload all the VMware-specific modules from the kernel?
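
To make the failure mode concrete, here is a minimal Go sketch of how userspace code could defend itself when a restart code leaks out of a system call. The numeric values are taken from include/linux/errno.h in the kernel source; the helper names (`isKernelRestartCode`, `retryOnRestart`) and the injected `call` function are hypothetical stand-ins, and this is not necessarily how the Go runtime deals with the problem.

```go
package main

import "fmt"

// Errno values on Linux. The 512+ range is kernel-internal and is
// meant to be translated (to EINTR or a transparent restart) before
// a system call returns to userspace.
const (
	eintr                = 4
	erestartSys          = 512
	erestartNoIntr       = 513
	erestartNoHand       = 514
	erestartRestartBlock = 516
)

// isKernelRestartCode reports whether errno is one of the
// kernel-internal restart codes that userspace should never see.
func isKernelRestartCode(errno int) bool {
	switch errno {
	case erestartSys, erestartNoIntr, erestartNoHand, erestartRestartBlock:
		return true
	}
	return false
}

// retryOnRestart invokes call until it returns something other than
// EINTR or a leaked restart code, or until maxAttempts calls have been
// made. It returns the last errno and the number of calls made.
// call is a hypothetical stand-in for a raw system call wrapper.
func retryOnRestart(call func() int, maxAttempts int) (errno, calls int) {
	for calls < maxAttempts {
		errno = call()
		calls++
		if errno != eintr && !isKernelRestartCode(errno) {
			break
		}
	}
	return errno, calls
}

func main() {
	// Simulated system call: leaks ERESTART_RESTARTBLOCK twice, then succeeds.
	results := []int{erestartRestartBlock, erestartRestartBlock, 0}
	i := 0
	fake := func() int {
		r := results[i]
		i++
		return r
	}
	errno, calls := retryOnRestart(fake, 10)
	fmt.Printf("errno=%d after %d calls\n", errno, calls)
}
```

Treating the leaked codes like EINTR only papers over the symptom; whatever mishandles TIF_SIGPENDING in the kernel would still need fixing.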

@rsc
Contributor

rsc commented Dec 16, 2009

Comment 7:

Labels changed: added expertneeded, removed helpwanted.

@rsc
Contributor

rsc commented Dec 17, 2009

Comment 8:

http://groups.google.com/group/golang-nuts/browse_thread/thread/c3a554fde3486918
Another report, this one using VirtualBox.

@rsc
Contributor

rsc commented Jan 31, 2010

Comment 9:

Issue #578 has been merged into this issue.

@rsc
Contributor

rsc commented Jan 31, 2010

Comment 10:

Status changed to HelpWanted.

@rsc
Contributor

rsc commented Feb 6, 2010

Comment 11:

Looks like libpthread avoids this problem by ignoring the return value from futex.
I'm not convinced that's better.  http://bit.ly/bywwwL
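
One reason ignoring the return value can be tolerable: futex waiters normally re-check the futex word in a loop, so any return from the wait, expected or not, just leads to another check. Below is a minimal Go sketch of that pattern. The futex call is replaced by a hypothetical injected `wait` function so the example runs anywhere, and `waitUntilZero` is an illustrative name, not a libpthread or Go runtime function.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// waitUntilZero loops until *word is 0 and returns how many times it
// had to wait. wait stands in for futex(FUTEX_WAIT, addr, val); its
// return value is deliberately ignored, because the loop re-reads the
// word after every return and therefore tolerates spurious wakeups
// and unexpected error codes alike.
func waitUntilZero(word *int32, wait func(addr *int32, val int32) int) (wakeups int) {
	for {
		v := atomic.LoadInt32(word)
		if v == 0 {
			return wakeups
		}
		_ = wait(word, v) // return value ignored on purpose
		wakeups++
	}
}

func main() {
	var word int32 = 1
	calls := 0
	// Simulated futex wait: the first call returns a bogus 516
	// (ERESTART_RESTARTBLOCK) without any state change; the second
	// behaves as if the lock holder cleared the word and woke us.
	fakeWait := func(addr *int32, val int32) int {
		calls++
		if calls == 1 {
			return 516
		}
		atomic.StoreInt32(addr, 0)
		return 0
	}
	fmt.Println("wakeups:", waitUntilZero(&word, fakeWait))
}
```

The cost of this approach is that a persistent error (for example a bad address) would turn the loop into a busy spin instead of surfacing a failure.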

@hoisie
Contributor

hoisie commented Feb 21, 2010

Comment 12:

Just ran into this problem on a Slicehost node, where I have a web application
running with 30 goroutines.
I have a stack trace here:
http://hoisie.com/freshstatus.txt
The system is CentOS 5.4 running a 2.6.31 kernel. I'll try to downgrade the kernel
and see if it still comes up.

@rsc
Contributor

rsc commented Feb 23, 2010

Comment 13:

This issue was closed by revision 8ba5c55.

Status changed to Fixed.

Merged into issue #-.

@rsc
Contributor

rsc commented Sep 11, 2010

Comment 14:

Issue #480 has been merged into this issue.

This issue was closed.