"fatal error -- Assertion failed: file "fso_cfscalls2.cc", line 268" during file transfer #38

Open
krichter722 opened this issue Jun 25, 2017 · 1 comment

@krichter722
Contributor

After transferring about 7GB of data in approximately 100K files, venus crashes with:

00:02:39 fatal error -- Assertion failed: file "fso_cfscalls2.cc", line 268

00:02:39 RecovTerminate: clean shutdown
Assertion failed: 0, file "fso_cfscalls2.cc", line 268
***BackTrace***
/usr/sbin/venus(coda_assert+0x76)[0x56525a2bca66]
/usr/sbin/venus(_Z5chokePKciS0_z+0xc8)[0x56525a27b428]
/usr/sbin/venus(_ZN5fsobj7ReleaseEi+0x164)[0x56525a264764]
/usr/sbin/venus(_ZN5fsobj5CloseEij+0x24)[0x56525a2648a4]
/usr/sbin/venus(_ZN5vproc5closeEP11venus_cnodei+0x18b)[0x56525a29de2b]
/usr/sbin/venus(_ZN6worker4mainEv+0xbdd)[0x56525a24919d]
/usr/sbin/venus(_Z13VprocPreamblePv+0xbe)[0x56525a2990ae]
/usr/lib/coda/liblwp.so.2(+0x5d7c)[0x7f719d39bd7c]
/lib/x86_64-linux-gnu/libc.so.6(+0x357f0)[0x7f719c7587f0]
/lib/x86_64-linux-gnu/libc.so.6(sigsuspend+0x16)[0x7f719c758b26]
[0x7ffc3ed9ad00]
Sleeping forever.  You may use gdb to attach to process 8098.

and the backtrace in gdb is

#0  0x00007f719c7f02d0 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007f719c7f023a in __sleep (seconds=0, seconds@entry=1) at ../sysdeps/posix/sleep.c:55
#2  0x000056525a2bcaf2 in coda_assert (pred=pred@entry=0x56525a2d3c70 "0", file=file@entry=0x56525a2c3e2d "fso_cfscalls2.cc", line=line@entry=268) at coda_assert.c:66
#3  0x000056525a27b428 in choke (file=file@entry=0x56525a2c3e2d "fso_cfscalls2.cc", line=line@entry=268, fmt=fmt@entry=0x56525a2c0978 "Assertion failed: file \"%s\", line %d\n") at venusutil.cc:208
#4  0x000056525a264764 in fsobj::Release (this=this@entry=0x9b38c810, writep=writep@entry=1) at fso_cfscalls2.cc:268
#5  0x000056525a2648a4 in fsobj::Close (this=0x9b38c810, writep=1, uid=<optimized out>) at fso_cfscalls2.cc:313
#6  0x000056525a29de2b in vproc::close (this=this@entry=0x56525b2dba80, cp=cp@entry=0x15175930, flags=3) at vproc_vfscalls.cc:264
#7  0x000056525a24919d in worker::main (this=0x56525b2dba80) at worker.cc:1205
#8  0x000056525a2990ae in VprocPreamble (arg=arg@entry=0x56525b2dbb08) at vproc.cc:152
#9  0x00007f719d39bd7c in _thread (sig=<optimized out>) at lwp_ucontext.c:91
#10 <signal handler called>
#11 0x00007f719c758b26 in __GI___sigsuspend (set=0x7ffc3ed9abb0) at ../sysdeps/unix/sysv/linux/sigsuspend.c:30
#12 0x00007ffc3ed9ad00 in ?? ()
#13 0x00007ffc3ed9ad00 in ?? ()
#14 0x00007ffc3ed9ad01 in ?? ()
#15 0x00007ffc3ed9adee in ?? ()
#16 0x00007ffc3ed9ad00 in ?? ()
#17 0x00007ffc3ed9adee in ?? ()
#18 0x0000000000000000 in ?? ()

Experienced with version 6.11.2-1+ubuntu16.10 on Ubuntu 16.10.

@jaharkes
Member

The assertion is supposed to trigger here: at that point we are trying to free a file object that still has 'open' references, so we should never have gotten to this point.

So there is either a reference count leak somewhere or a race condition. 7GB of data over 100k files isn't all that much. I actually walked the entire coda.cs.cmu.edu tree last week, which has quite a few more files than that, but that was a read-only action and I was only checking for conflicts, so there were probably not a whole lot of 'open/close' calls on the files (and those only for reading, if at all).
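
To make the invariant concrete, here is a minimal sketch of the kind of check described above. It is not the code in `fso_cfscalls2.cc`: the `CachedFile` class, its counters, and its method names are hypothetical, and it throws an exception where venus calls `choke`/`coda_assert`. It only illustrates why an object that still has open references must not be freed, and how an unbalanced open/close pair (a leaked reference, or two threads racing on the counters) would trip such a check.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a cached file object. This is NOT Coda's fsobj;
// the names and counters below are assumptions made for illustration only.
class CachedFile {
public:
    // Record one more open reference (and one more writer, if writing).
    void Open(bool writing) {
        ++openers_;
        if (writing) ++writers_;
    }

    // Drop one open reference. A close without a matching open is a bug.
    void Close(bool writing) {
        if (openers_ == 0 || (writing && writers_ == 0))
            throw std::logic_error("Close without matching Open");
        --openers_;
        if (writing) --writers_;
    }

    // Free the object. Reaching this with open references still held means
    // a reference was leaked or the counters were updated in a race.
    void Discard() {
        if (openers_ != 0)
            throw std::logic_error("Discard with open references");
        discarded_ = true;
    }

    int openers() const { return openers_; }
    bool discarded() const { return discarded_; }

private:
    int openers_ = 0;
    int writers_ = 0;
    bool discarded_ = false;
};
```

With this sketch, calling `Discard()` between an `Open(true)` and its matching `Close(true)` throws, while the same call after the close succeeds. A reproduction attempt for this issue would presumably need many write-mode open/close cycles rather than a read-only tree walk.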
