This repository has been archived by the owner on Mar 9, 2019. It is now read-only.
Inside tx.Commit, tx.root.spill() is returning an error, causing tx.rollback() to be called. tx.rollback() calls db.meta(), which dereferences the DB's meta pages, and that dereference causes the segfault. Investigating further, the error returned by tx.root.spill() can be traced back to a call to mmap. The specific error is: "truncate: truncate test.db: There is not enough space on the disk".
My guess is that the failed mmap invalidates db.meta0 and/or db.meta1. The existing data is unmapped before mmap is called, so this seems likely. However, I don't have a good explanation for why this only causes a segfault on Windows. Linux correctly returns an error from the db.Update call, and I believe OS X does as well.
I'm not sure what the best way to handle this is. One idea would be to set a special flag in db if the mmap fails. This is a critical failure state, so there's some justification for handling it specially. The flag would cause tx.rollback() to skip some of its cleanup steps, in particular this call:
In addition, it would be helpful for client programs if the db.Update call returned a special error value (or type) so that they could detect it and decide whether to panic. Even if bolt itself panicked, that would be miles better than a segfault, since a normal panic can at least be caught and converted to a more user-friendly error message.
Let me know if I can test any other potential fixes. In the meantime I'll probably go ahead and implement the fix described above.
However, I'm not sure if this behavior is safe. Skipping tx.rollback means we don't call freelist.rollback or freelist.reload. Could this cause trouble if the caller repeatedly tries to call Update?
A bigger issue is that, even though a nice error message is returned now, the db file is still unmapped. I haven't tested it, but I assume that in this state it would segfault even if you called db.View. As such, we should probably proceed with the original plan to add a flag to the db type. That way, we could return an error immediately from calls that would otherwise segfault.
Filling the disk with a bolt database causes a segfault on Windows. This script reproduces the bug (tested on Windows 10).
stack trace: