
(from spl/issues/389) -- SPLError: 26788:0:(range_tree.c:172:range_tree_add()) SPL PANIC #2720

Closed
TioNisla opened this issue Sep 19, 2014 · 7 comments


@TioNisla

"rsync -PSAXrltgoD --del" from remote host caused 'SPLError'

full text here: https://gist.github.com/anonymous/39e252399acb6912a16e

P.S.
After removing the storage/samba filesystem, this problem appears when importing the pool, and it is not possible to export or destroy the pool: all zfs/zpool related commands just hang.

@yeoldegrove

I can confirm this.
It happens on my system when importing a specific zpool.
After the exception the system itself still responds, but the pool does not get imported.

My OS crashed a few times because of a broken RAM module.
Even after removing the broken hardware, a "zpool import" is still not possible!
I tried several up-to-date distributions (Arch, Debian, Ubuntu) with the official ZFS packages.

Any chance this gets fixed? Can I contribute?

@behlendorf
Contributor

Could you try importing the pool using the latest source from GitHub? There have been several fixes in this area. You can find directions for building the latest source at zfsonlinux.org.

@yeoldegrove

I built spl, zfs, and zfs-utils on my Arch box from a checkout of the latest GitHub source.
No change.

Details: https://gist.github.com/yeoldegrove/b1d5a83587dbce437c52

@behlendorf
Contributor

@yeoldegrove the failure you're seeing indicates that somehow the same address is being freed twice. Can you please rebuild your ZFS source with the following patch applied and set the zfs_recover=1 module option? Assuming this is the only problem, it should allow you to import the pool and log the offending address so we can sanity-check it.

diff --git a/module/zfs/range_tree.c b/module/zfs/range_tree.c
index 4643d26..b5c9222 100644
--- a/module/zfs/range_tree.c
+++ b/module/zfs/range_tree.c
@@ -175,7 +175,7 @@ range_tree_add(void *arg, uint64_t start, uint64_t size)
        rsearch.rs_end = end;
        rs = avl_find(&rt->rt_root, &rsearch, &where);

-       if (rs != NULL && rs->rs_start <= start && rs->rs_end >= end) {
+       if (rs != NULL) {
                zfs_panic_recover("zfs: allocating allocated segment"
                    "(offset=%llu size=%llu)\n",
                    (longlong_t)start, (longlong_t)size);
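
For readers skimming the thread, here is a minimal user-space sketch (an assumption-laden illustration, not the real ZFS code) of what the patched check does: once avl_find() returns any existing segment at the target range, zfs_panic_recover() is called, which only warns instead of panicking when the zfs_recover=1 module option is set. The names range_seg, panic_recover() and range_add() below are simplified stand-ins for range_seg_t, zfs_panic_recover() and range_tree_add(); the offset and size are taken from the dmesg output quoted later in this thread.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

static int zfs_recover = 1;		/* models the zfs_recover=1 module option */

struct range_seg {			/* simplified stand-in for range_seg_t */
	uint64_t rs_start;
	uint64_t rs_end;
};

/* Models zfs_panic_recover(): warn when zfs_recover is set, panic otherwise. */
static void
panic_recover(const char *msg, uint64_t start, uint64_t size)
{
	if (zfs_recover) {
		fprintf(stderr, "WARNING: %s (offset=%llu size=%llu)\n",
		    msg, (unsigned long long)start, (unsigned long long)size);
	} else {
		fprintf(stderr, "PANIC: %s (offset=%llu size=%llu)\n",
		    msg, (unsigned long long)start, (unsigned long long)size);
		abort();
	}
}

/*
 * Models the patched range_tree_add() check: 'found' stands for the result
 * of avl_find(); any hit at all is reported, not only a segment that fully
 * contains the new range, and the duplicate add is then skipped.
 */
static void
range_add(struct range_seg *found, uint64_t start, uint64_t size)
{
	if (found != NULL) {
		panic_recover("zfs: allocating allocated segment", start, size);
		return;		/* skip the duplicate add; the space is leaked */
	}
	printf("added segment offset=%llu size=%llu\n",
	    (unsigned long long)start, (unsigned long long)size);
}

int
main(void)
{
	/* Offset and size from the warning yeoldegrove later reports via dmesg. */
	struct range_seg existing = {
		870105374720ULL, 870105374720ULL + 67117056ULL
	};

	range_add(NULL, existing.rs_start, 67117056ULL);	/* first free */
	range_add(&existing, existing.rs_start, 67117056ULL);	/* double free */
	return (0);
}

With zfs_recover left at its default of 0, the second call would panic, which is essentially what the reporters hit at import time.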

@yeoldegrove

Works like a charm now.
The only messages are:

root@host ~ # dmesg | grep -Ei 'SPL|ZFS'
[ 9.034938] SPL: Loaded module v0.6.3-44_g46c9367
[ 9.062044] ZFS: Loaded module v0.6.3-133_g9635861, ZFS pool version 5000, ZFS filesystem version 5
[ 59.407576] SPL: using hostid 0x640a2900
[ 86.933817] SPLError: 2287:0:(spl-err.c:88:vcmn_err()) WARNING: zfs: allocating allocated segment(offset=870105374720 size=67117056)

The pool now imports and exports in seconds, even across reboots.

I'll go back to the usual upstream code and will report later whether it still works.

@yeoldegrove

After running the regular packages from the Arch demz-repo-core repo for two days now, everything seems to run fine.

behlendorf added a commit to behlendorf/zfs that referenced this issue Feb 12, 2015
When a bad DVA is encountered in metaslab_free_dva() the system
should treat it as fatal.  This indicates that somehow a damaged
DVA was written to disk and that should be impossible.

However, we have seen a handful of reports over the years of pools
somehow being damaged in this way.  Since this damage can render
otherwise intact pools unimportable, and the consequence of skipping
the bad DVA is only leaked free space, it makes sense to provide
a mechanism to ignore the bad DVA.  Setting the zfs_recover=1 module
option will cause the DVA to be ignored which may allow the pool to
be imported.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#3090
Issue openzfs#2720
behlendorf added a commit to behlendorf/zfs that referenced this issue Feb 14, 2015
When a bad DVA is encountered in metaslab_free_dva() the system
should treat it as fatal.  This indicates that somehow a damaged
DVA was written to disk and that should be impossible.

However, we have seen a handful of reports over the years of pools
somehow being damaged in this way.  Since this damage can render
otherwise intact pools unimportable, and the consequence of skipping
the bad DVA is only leaked free space, it makes sense to provide
a mechanism to ignore the bad DVA.  Setting the zfs_recover=1 module
option will cause the DVA to be ignored which may allow the pool to
be imported.

Since zfs_recover=0 by default any pool attempting to free a bad DVA
will treat it as a fatal error preserving the current behavior.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#3099
Issue openzfs#3090
Issue openzfs#2720
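
As a rough illustration of the policy this commit describes (a hedged sketch, not the actual metaslab.c code): freeing a block whose DVA fails basic sanity checks is fatal by default, while zfs_recover=1 downgrades it to a warning and skips the free, the only cost being leaked free space. The dva struct, dva_is_sane() and free_dva() below are simplified placeholders rather than the real metaslab_free_dva() interface.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

static int zfs_recover = 0;	/* default: a bad DVA is treated as fatal */

struct dva {			/* simplified stand-in for dva_t */
	uint64_t dva_vdev;	/* vdev the block claims to live on */
	uint64_t dva_offset;
	uint64_t dva_asize;
};

/* Placeholder check; the real code validates against vdev and metaslab state. */
static int
dva_is_sane(const struct dva *d, uint64_t vdev_count)
{
	return (d->dva_vdev < vdev_count && d->dva_asize != 0);
}

static void
free_dva(const struct dva *d, uint64_t vdev_count)
{
	if (!dva_is_sane(d, vdev_count)) {
		if (zfs_recover) {
			/* Ignore the bad DVA; the space it covered is leaked. */
			fprintf(stderr, "WARNING: ignoring bad DVA on vdev %llu\n",
			    (unsigned long long)d->dva_vdev);
			return;
		}
		fprintf(stderr, "PANIC: freeing bad DVA on vdev %llu\n",
		    (unsigned long long)d->dva_vdev);
		abort();
	}
	printf("freed %llu bytes on vdev %llu\n",
	    (unsigned long long)d->dva_asize, (unsigned long long)d->dva_vdev);
}

int
main(void)
{
	struct dva good = { 0, 4096, 131072 };	/* plausible block */
	struct dva bad = { 42, 0, 0 };		/* vdev index out of range */

	free_dva(&good, 2);
	zfs_recover = 1;	/* as the commit suggests for recovery imports */
	free_dva(&bad, 2);
	return (0);
}

The design trade-off is the one the commit message states: a leaked segment is far cheaper than an unimportable pool, so the recovery path is opt-in rather than the default.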
@behlendorf
Contributor

A patch for ignoring bad DVAs on blocks which are being freed when zfs_recover=1 is set has been merged. This doesn't get to the root cause of how the damage could happen, but it does provide a more convenient way to recover a pool which has been damaged in this way.
