
(from spl/issues/389) -- SPLError: 26788:0:(range_tree.c:172:range_tree_add()) SPL PANIC #2720

Closed
TioNisla opened this issue Sep 19, 2014 · 7 comments


@TioNisla

"rsync -PSAXrltgoD --del" from remote host caused 'SPLError'

full text here: https://gist.github.com/anonymous/39e252399acb6912a16e

P.S.
After removing the storage/samba filesystem, this problem appears when importing the pool, and it is not possible to export or destroy the pool: all zfs/zpool related commands just hang.

@yeoldegrove

I can confirm this.
It happens on my system when importing a specific zpool.
After the exception the system itself still responds, but the pool does not get imported.

My OS crashed a few times because of a broken RAM module.
Even after removing the broken hardware, a "zpool import" is still not possible!
I tried several up-to-date distributions (Arch, Debian, Ubuntu) with the official ZFS packages.

Any chance this gets fixed? Can I contribute?

@behlendorf
Contributor

Could you try importing the pool using the latest source from GitHub? There have been several fixes in this area. You can find directions for building the latest source at zfsonlinux.org.

@yeoldegrove

I built spl, zfs, and zfs-utils on my Arch box from a checkout of the latest GitHub source.
No change.

Details: https://gist.github.com/yeoldegrove/b1d5a83587dbce437c52

@behlendorf
Contributor

@yeoldegrove the failure you're seeing indicates that somehow the same address is being freed twice. Can you please rebuild your ZFS source with the following patch applied and set the zfs_recover=1 module option? Assuming this is the only problem, it should allow you to import the pool and log the offending address so we can sanity-check it.

diff --git a/module/zfs/range_tree.c b/module/zfs/range_tree.c
index 4643d26..b5c9222 100644
--- a/module/zfs/range_tree.c
+++ b/module/zfs/range_tree.c
@@ -175,7 +175,7 @@ range_tree_add(void *arg, uint64_t start, uint64_t size)
        rsearch.rs_end = end;
        rs = avl_find(&rt->rt_root, &rsearch, &where);

-       if (rs != NULL && rs->rs_start <= start && rs->rs_end >= end) {
+       if (rs != NULL) {
                zfs_panic_recover("zfs: allocating allocated segment"
                    "(offset=%llu size=%llu)\n",
                    (longlong_t)start, (longlong_t)size);
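
For readers skimming the thread, here is a minimal user-space sketch (an assumption-laden illustration, not the real ZFS code) of what the patched check does: once avl_find() returns any existing segment at the target range, zfs_panic_recover() is called, which only warns instead of panicking when the zfs_recover=1 module option is set. The names range_seg, panic_recover() and range_add() below are simplified stand-ins for range_seg_t, zfs_panic_recover() and range_tree_add(); the offset and size are taken from the dmesg output quoted later in this thread.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

static int zfs_recover = 1;		/* models the zfs_recover=1 module option */

struct range_seg {			/* simplified stand-in for range_seg_t */
	uint64_t rs_start;
	uint64_t rs_end;
};

/* Models zfs_panic_recover(): warn when zfs_recover is set, panic otherwise. */
static void
panic_recover(const char *msg, uint64_t start, uint64_t size)
{
	if (zfs_recover) {
		fprintf(stderr, "WARNING: %s (offset=%llu size=%llu)\n",
		    msg, (unsigned long long)start, (unsigned long long)size);
	} else {
		fprintf(stderr, "PANIC: %s (offset=%llu size=%llu)\n",
		    msg, (unsigned long long)start, (unsigned long long)size);
		abort();
	}
}

/*
 * Models the patched range_tree_add() check: 'found' stands for the result
 * of avl_find(); any hit at all is reported, not only a segment that fully
 * contains the new range, and the duplicate add is then skipped.
 */
static void
range_add(struct range_seg *found, uint64_t start, uint64_t size)
{
	if (found != NULL) {
		panic_recover("zfs: allocating allocated segment", start, size);
		return;		/* skip the duplicate add; the space is leaked */
	}
	printf("added segment offset=%llu size=%llu\n",
	    (unsigned long long)start, (unsigned long long)size);
}

int
main(void)
{
	/* Offset and size from the warning yeoldegrove later reports via dmesg. */
	struct range_seg existing = {
		870105374720ULL, 870105374720ULL + 67117056ULL
	};

	range_add(NULL, existing.rs_start, 67117056ULL);	/* first free */
	range_add(&existing, existing.rs_start, 67117056ULL);	/* double free */
	return (0);
}

With zfs_recover left at its default of 0, the second call would panic, which is essentially what the reporters hit at import time.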

@yeoldegrove

Works like a charm now.
The only messages are:

root@host ~ # dmesg | grep -Ei 'SPL|ZFS'
[ 9.034938] SPL: Loaded module v0.6.3-44_g46c9367
[ 9.062044] ZFS: Loaded module v0.6.3-133_g9635861, ZFS pool version 5000, ZFS filesystem version 5
[ 59.407576] SPL: using hostid 0x640a2900
[ 86.933817] SPLError: 2287:0:(spl-err.c:88:vcmn_err()) WARNING: zfs: allocating allocated segment(offset=870105374720 size=67117056)

The pool now imports and exports in seconds, even across reboots.

I'll go back to the usual upstream code and will report later whether it still works.

@yeoldegrove

After running the regular packages from the Arch demz-repo-core repo for two days now, everything seems to run fine.

behlendorf added a commit to behlendorf/zfs that referenced this issue Feb 12, 2015
When a bad DVA is encountered in metaslab_free_dva() the system
should treat it as fatal.  This indicates that somehow a damaged
DVA was written to disk and that should be impossible.

However, we have seen a handful of reports over the years of pools
somehow being damaged in this way.  Since this damage can render
otherwise intact pools unimportable, and the consequence of skipping
the bad DVA is only leaked free space, it makes sense to provide
a mechanism to ignore the bad DVA.  Setting the zfs_recover=1 module
option will cause the DVA to be ignored which may allow the pool to
be imported.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#3090
Issue openzfs#2720
behlendorf added a commit to behlendorf/zfs that referenced this issue Feb 14, 2015
When a bad DVA is encountered in metaslab_free_dva() the system
should treat it as fatal.  This indicates that somehow a damaged
DVA was written to disk and that should be impossible.

However, we have seen a handful of reports over the years of pools
somehow being damaged in this way.  Since this damage can render
otherwise intact pools unimportable, and the consequence of skipping
the bad DVA is only leaked free space, it makes sense to provide
a mechanism to ignore the bad DVA.  Setting the zfs_recover=1 module
option will cause the DVA to be ignored which may allow the pool to
be imported.

Since zfs_recover=0 by default any pool attempting to free a bad DVA
will treat it as a fatal error preserving the current behavior.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#3099
Issue openzfs#3090
Issue openzfs#2720
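
As a rough illustration of the policy this commit describes (a hedged sketch, not the actual metaslab.c code): freeing a block whose DVA fails basic sanity checks is fatal by default, while zfs_recover=1 downgrades it to a warning and skips the free, the only cost being leaked free space. The dva struct, dva_is_sane() and free_dva() below are simplified placeholders rather than the real metaslab_free_dva() interface.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

static int zfs_recover = 0;	/* default: a bad DVA is treated as fatal */

struct dva {			/* simplified stand-in for dva_t */
	uint64_t dva_vdev;	/* vdev the block claims to live on */
	uint64_t dva_offset;
	uint64_t dva_asize;
};

/* Placeholder check; the real code validates against vdev and metaslab state. */
static int
dva_is_sane(const struct dva *d, uint64_t vdev_count)
{
	return (d->dva_vdev < vdev_count && d->dva_asize != 0);
}

static void
free_dva(const struct dva *d, uint64_t vdev_count)
{
	if (!dva_is_sane(d, vdev_count)) {
		if (zfs_recover) {
			/* Ignore the bad DVA; the space it covered is leaked. */
			fprintf(stderr, "WARNING: ignoring bad DVA on vdev %llu\n",
			    (unsigned long long)d->dva_vdev);
			return;
		}
		fprintf(stderr, "PANIC: freeing bad DVA on vdev %llu\n",
		    (unsigned long long)d->dva_vdev);
		abort();
	}
	printf("freed %llu bytes on vdev %llu\n",
	    (unsigned long long)d->dva_asize, (unsigned long long)d->dva_vdev);
}

int
main(void)
{
	struct dva good = { 0, 4096, 131072 };	/* plausible block */
	struct dva bad = { 42, 0, 0 };		/* vdev index out of range */

	free_dva(&good, 2);
	zfs_recover = 1;	/* as the commit suggests for recovery imports */
	free_dva(&bad, 2);
	return (0);
}

The design trade-off is the one the commit message states: a leaked segment is far cheaper than an unimportable pool, so the recovery path is opt-in rather than the default.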
@behlendorf
Contributor

A patch for ignoring bad DVAs on blocks which are being freed when zfs_recover=1 is set has been merged. This doesn't get to the root cause of how the damage could happen, but it does provide a more convenient way to recover a pool which has been damaged in this way.
