arc_adapt hangs in "D" state while reading data from a snapshot #1215
Comments
Also kernel log and dump from a second machine, daily 0.6.0.92-0ubuntu1~precise1 on ubuntu 3.2.0-35-generic http://dl.transfer.ro/transfer_ro-17jan-1cd8a1f9976c0341.zip |
openzfs/spl#97 reference link to original issue with more debug data |
@mailinglists35 Can you apply the above debug patch to the SPL and try to reproduce the issue? If you hit the bug please post console output to this issue. Thanks |
Hi, I downloaded: I compiled spl ok, but zfs refuses to compile here: make[3]: Entering directory `/usr/src/linux-headers-3.2.0-4-amd64' |
Update ZFS to the latest master source. That symbol was recently removed from the SPL because the latest ZFS version doesn't use it. |
ran it twice just to make sure |
the first call trace happens when pressing the TAB key in bash to complete a .zfs/snapshot/ path |
Exact console output:
root@homerouter:~# find /mnt/seagate910/homerouter/.zfs/snapshot/zfs-auto-snap_daily-2012-12-25-0640/ 2>&1 > /dev/null
Message from syslogd@homerouter at Jan 21 04:40:09 ...
Message from syslogd@homerouter at Jan 21 04:40:09 ...
Message from syslogd@homerouter at Jan 21 04:40:09 ...
Message from syslogd@homerouter at Jan 21 04:40:09 ...
Message from syslogd@homerouter at Jan 21 04:40:09 ...
root@homerouter:~# screendump |
If destroying a held lock, log who's holding it before panicking. If releasing a lock that's being destroyed, dump a stack trace. This should tell us something about what the lock holder is doing when the lock is destroyed.
@mailinglists35 Sorry for the buggy debug patch. I'm guessing it tried to dereference a NULL |
root@homerouter:~# find /mnt/seagate910/homerouter/.zfs/snapshot/zfs-auto-snap_hourly-2013-01-20-1517/ 2>&1 > /dev/null
Message from syslogd@homerouter at Jan 21 21:09:30 ...
Message from syslogd@homerouter at Jan 21 21:09:30 ...
If I uploaded the spl dump (http://dl.transfer.ro/transfer_ro-21jan-95eb2d56ac.zip), is the kernel stack trace still needed? |
Thanks for the testing. I was hoping to identify which process was holding the lock being freed, but your results suggest no one is in fact holding it. Perhaps we have a race in the test itself. |
Can you post the output of
|
|
Also can you post the |
So a NULL
int
fzap_cursor_move_to_key(zap_cursor_t *zc, zap_name_t *zn)
{
	int err;
	zap_leaf_t *l;
	zap_entry_handle_t zeh;

	if (zn->zn_key_orig_numints * zn->zn_key_intlen > ZAP_MAXNAMELEN)
		return (ENAMETOOLONG);

	err = zap_deref_leaf(zc->zc_zap, zn->zn_hash, NULL, RW_READER, &l);
	if (err != 0)
		return (err);

	/***************** Now holding l->l_rwlock as reader via
	 ***************** zap_deref_leaf->zap_get_leaf_byblk->rw_enter */
	err = zap_leaf_lookup(l, zn, &zeh);
	if (err != 0)
		return (err); /* Missing zap_put_leaf(l) ? */

	zc->zc_leaf = l;
	zc->zc_hash = zeh.zeh_hash;
	zc->zc_cd = zeh.zeh_cd;

	return (err);
}
Have no idea if this happens in the case of this bug, but if there is in fact a missing put here we could fix it and see if it helps. |
@nedbass This does look like a real bug but it's unclear to me how it could be a problem in this context. The Also by my reading of this code we should be dropping the |
@behlendorf thanks for the analysis. Note that zfsctl_snapdir_lookup() calls dmu_snapshot_id() which calls zap_cursor_move_to_key(). |
Ahh, somehow I missed that call site. You're right. Then this is a very good explanation for the issue, since this is all ZoL-specific code. I'd expect that dropping the lock as described above should resolve the issue. |
@mailinglists35 please try the above patch. Thanks |
thanks. I might be rushing to report, but apparently the patch resolved the issue. however now I get this when first accessing the .zfs/snapshot directory (pressing tab autocomplete in bash)
|
That is #1230, a regression recently introduced into master. I believe it should be harmless aside from the scary backtrace, but a fix for it should land soon. Glad to hear the initial test results are good. |
tested the second box, spl no longer panics. I have started a read of all snapshots overnight on both machines and will post an update when finished, but as far as I can tell the bug is gone. where do you want the beer delivered? :) |
Callers of zap_deref_leaf() must be careful to drop leaf->l_rwlock since that function returns with the lock held on success. All other callers drop the lock correctly but it seems fzap_cursor_move_to_key() does not. This may block writers or cause VERIFY failures when the lock is freed. Fixes openzfs#1215 Fixes openzfs/spl#143 Fixes openzfs/spl#97
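To make the lock-leak pattern in the commit message above concrete, here is a minimal sketch in plain C. It is not the actual patch and does not use the real ZAP or SPL types: `toy_leaf_t`, `toy_deref_leaf()`, `toy_put_leaf()`, `toy_leaf_lookup()` and both `toy_move_to_key_*()` functions are hypothetical stand-ins, and the lock is modelled as a simple reader count. The sketch only covers the error path discussed in the thread; on success it assumes the caller takes over the hold and releases it later.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy leaf: one key plus a count of reader holds on its lock. */
typedef struct toy_leaf {
	int readers;
	const char *key;
} toy_leaf_t;

/* Same contract as described above: returns with the lock held on success. */
int
toy_deref_leaf(toy_leaf_t *leaf, toy_leaf_t **lp)
{
	if (leaf == NULL)
		return (EIO);
	leaf->readers++;
	*lp = leaf;
	return (0);
}

/* Drop one reader hold. */
void
toy_put_leaf(toy_leaf_t *l)
{
	assert(l->readers > 0);
	l->readers--;
}

/* Exact-match lookup in the toy leaf. */
int
toy_leaf_lookup(const toy_leaf_t *l, const char *name)
{
	return (strcmp(l->key, name) == 0 ? 0 : ENOENT);
}

/* Leaky shape: the early return on a failed lookup keeps the hold. */
int
toy_move_to_key_leaky(toy_leaf_t *leaf, const char *name, toy_leaf_t **held)
{
	toy_leaf_t *l;
	int err;

	err = toy_deref_leaf(leaf, &l);
	if (err != 0)
		return (err);
	err = toy_leaf_lookup(l, name);
	if (err != 0)
		return (err);	/* hold is never dropped */
	*held = l;
	return (0);
}

/* Fixed shape: every error return after the deref drops the hold first. */
int
toy_move_to_key_fixed(toy_leaf_t *leaf, const char *name, toy_leaf_t **held)
{
	toy_leaf_t *l;
	int err;

	err = toy_deref_leaf(leaf, &l);
	if (err != 0)
		return (err);
	err = toy_leaf_lookup(l, name);
	if (err != 0) {
		toy_put_leaf(l);
		return (err);
	}
	*held = l;	/* caller now owns the hold and must put it */
	return (0);
}
```

In this toy model, a failed lookup through the leaky variant leaves `readers` at 1 with nobody left to release it. That is the kind of state a "lock still held at free time" check would catch, and it may correspond to the VERIFY failure in the original report.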
I agree that fzap_cursor_move_to_key() should drop the lock. However, I believe that this routine is being used incorrectly, and should be removed. dmu_snapshot_id() should return the ID of the named snapshot. However, if the snapshot does not exist, it returns a different snapshot's ID. Instead, we should use dsl_dataset_snap_lookup(). |
I let this run overnight on both machines: one machine is now with arc_adapt running 100% cpu and on the other rsync is blocked in D state and no other zfs commands complete and some kernel warnings on dmesg. will post details when I get some time. |
@ahrens would it not be a layering violation to use dsl_dataset_snap_lookup() from the POSIX layer? The ZFS architectural diagram shows the ZPL accessing DSL indirectly through the ZAP and DMU layers: Also, I'm not sure I understand your comment about dmu_snapshot_id() returning a different snapshot's ID. It looks to me like it returns ENOENT and leaves |
@nedbass dmu_snapshot_id() moves the cursor to the given snapshot name, then it retrieves the cursor. If the snapshot is deleted after the call to zap_cursor_move_to_key() but before the call to zap_cursor_retrieve(), the entry you expect will not exist when zap_cursor_retrieve() is called, so it will return the next entry. As far as layering goes, you could make dmu_snapshot_id() be a one-liner: return (dsl_dataset_snap_lookup(dmu_objset_ds(os), snapname, idp)); Although personally I'd just inline that into the one caller. The layering diagram is idealized, and I don't think it really hurts anything for the ZPL to pass the dataset to the DSL (as evidenced by many other uses of dmu_objset_ds() in the ZPL and ZIL). |
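The race described above can be sketched with a toy table in C. This is only an illustration under loud assumptions: `toy_snap_t`, `toy_cursor_t` and the three `toy_*` functions are hypothetical, the table is sorted by name, and name order stands in for the hash order a real ZAP cursor walks. The point it tries to show is that "position the cursor, then retrieve" can hand back a neighbouring entry if the target disappears in between, while an exact-match lookup reports ENOENT.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical snapshot entry; `live == 0` models a deleted snapshot. */
typedef struct toy_snap {
	const char *name;
	unsigned long id;
	int live;
} toy_snap_t;

/* Hypothetical cursor: remembers only a position (here, a name). */
typedef struct toy_cursor {
	const char *pos;
} toy_cursor_t;

/* Position the cursor on `name`; fails if the entry does not exist now. */
int
toy_cursor_move_to_key(const toy_snap_t *tab, size_t n, const char *name,
    toy_cursor_t *c)
{
	for (size_t i = 0; i < n; i++) {
		if (tab[i].live && strcmp(tab[i].name, name) == 0) {
			c->pos = tab[i].name;
			return (0);
		}
	}
	return (ENOENT);
}

/* Return the first live entry at or after the cursor position. */
int
toy_cursor_retrieve(const toy_snap_t *tab, size_t n, const toy_cursor_t *c,
    const toy_snap_t **out)
{
	assert(c->pos != NULL);
	for (size_t i = 0; i < n; i++) {
		if (tab[i].live && strcmp(tab[i].name, c->pos) >= 0) {
			*out = &tab[i];
			return (0);
		}
	}
	return (ENOENT);
}

/* Exact-match lookup: either the named entry's ID or ENOENT. */
int
toy_snap_lookup(const toy_snap_t *tab, size_t n, const char *name,
    unsigned long *idp)
{
	for (size_t i = 0; i < n; i++) {
		if (tab[i].live && strcmp(tab[i].name, name) == 0) {
			*idp = tab[i].id;
			return (0);
		}
	}
	return (ENOENT);
}
```

If an entry is marked deleted after `toy_cursor_move_to_key()` succeeds but before `toy_cursor_retrieve()` runs, the retrieve step returns the next live entry rather than an error. A single exact-match lookup has no such window in this model, which is the argument made above for using dsl_dataset_snap_lookup() instead.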
@ahrens Thanks for the helpful explanation. |
@ahrens I know this functionality was originally added for Lustre, but after looking at the latest Lustre source I don't see any But you raise a good point about It looks like OpenSolaris's @mailinglist35 If you can post stack traces we can identify whether these are different issues. |
stack traces: oh, maybe this helps, there are some nonstandard kernel values on each box:
|
@mailinglists35 The first log (homerouter) looks like #1224, so probably unrelated to this bug. As to the second log (mailhost) I guess the arc_adapt is busy trying to reclaim cached metadata, similar to the report in #790. Again it looks unrelated to this bug, but you could try tuning |
@maxximino I thought that at first too, but the box was rebooted since then (around line 4934) |
@nedbass apologies, I forgot to rotate the logs first. |
thanks everyone. I'll wait for the patch to be merged & the ppa package updated and then move on to other suggested issues |
Callers of zap_deref_leaf() must be careful to drop leaf->l_rwlock since that function returns with the lock held on success. All other callers drop the lock correctly but it seems fzap_cursor_move_to_key() does not. This may block writers or cause VERIFY failures when the lock is freed. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1215 Closes openzfs/spl#143 Closes openzfs/spl#97
The remaining dmu_snapshot_id() change has been filed as #1238. |
Retire the dmu_snapshot_id() function which was introduced in the initial .zfs control directory implementation. There is already an existing dsl_dataset_snap_lookup() which does exactly what we need, and the dmu_snapshot_id() function as implemented is racy. openzfs#1215 (comment) Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#1238
I get this combination of "VERIFY(!RW_LOCK_HELD(&l->l_rwlock)) failed" + "zap.c" + "arc_adapt" remaining in D state every time I try to read a significant amount of data from a snapshot, on every version I've installed since half a year ago or so.
To continue using pool/fs I must hard reboot the machine or shortcut via "echo b > /proc/sysrq-trigger"
Reposting latest kernel log, echo t > /proc/sysrq-trigger and spl dump. I can provide ssh if needed or run debugging version etc
As the logs say, I am running daily ppa 0.6.0.92-0ubuntu1~oneiric1 on debian 3.2.0-4-amd64
http://pastebin.com/nQJ4Kcup
http://dl.transfer.ro/transfer_ro-17jan-2bae4ca17cf4.zip