
EBUSY upon ZPOOL_EXPORT #1045

Closed
stephane-chazelas opened this issue Oct 15, 2012 · 4 comments

Comments

@stephane-chazelas

I got a second occurrence of the issue described at http://thread.gmane.org/gmane.linux.file-systems.zfs.user/4661

I've been doing an "offsite backup" every week, whereby I zfs-send|zfs-recv a number of datasets from one zpool onto another zpool on a pair of hard drives (well, LUKS devices on top of hard drives). I do a zpool export and luksClose before taking the drives offsite.

Today, for some reason, the zpool export fails with:

zpool export offsite-backup-05
cannot export 'offsite-backup-05': pool is busy

There is no zfs command running, nothing mounted on the pool (zpool export managed that part; I checked /proc/mounts as well), nothing is using the zvols in there, and there is no loop device or anything similar. I tried killall -STOP udevd in case it was somehow accessing things while the export was trying to tidy them away.

I've got a sysrq-t output, but I'm not sure what to look for to see what may be holding it.

Trying to "zfs mount -a" to see if I can mount it back, it says for every mount point:

filesystem 'offsite-backup-05/main/servers/skywalker/shadow_nbd/c' is already mounted
cannot mount 'offsite-backup-05/main/servers/skywalker/shadow_nbd/c': Resource temporarily unavailable

Meanwhile, "grep offsite-backup-05 /proc/mounts" returns nothing.

So something is definitely going wrong there.

I can still read the zvols on there, though.

I have zevents going to the console (zfs_zevent_console=1) and there has been nothing: no I/O error, nothing at all. (I used to get a lot of oopses, but since upgrading the memory to 48GB it had been quite stable until now.)

Before rebooting, I also tried to export the other zpool (the one I was zfs-sending from) and got the same EBUSY error (a successful umount, but EBUSY upon the ioctl(ZPOOL_EXPORT), as with the other pool).

I noticed (in top) an arc_adapt thread taking 100% of one CPU. Running sysrq-l a few times showed it each time in:

Pid: 477, comm: arc_adapt Tainted: P           O 3.2.0-29-generic #46-Ubuntu Dell Inc. PowerEdge R515/03X0MN
RIP: 0010:[<ffffffff81179f4d>]  [<ffffffff81179f4d>] __put_super+0x6d/0x80
RSP: 0018:ffff8806470b5dc0  EFLAGS: 00000202
RAX: 0000000000000001 RBX: ffff880ab13b9c00 RCX: 0000000000000001
RDX: 000000000000bec5 RSI: ffff880653a41700 RDI: ffff880ab13b9c00
RBP: ffff8806470b5dd0 R08: ffff8806470b4000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff880ab13b9c00
R13: ffffffffa0214850 R14: ffffffff81f03c20 R15: ffffffffa01e7f20
FS:  00007f235cb4b700(0000) GS:ffff880c7f600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000207f6d0 CR3: 0000000001c05000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process arc_adapt (pid: 477, threadinfo ffff8806470b4000, task ffff880653a41700)
Stack:
 ffff8806470b5dd0 ffff880ab13b9c00 ffff8806470b5e20 ffffffff8117a0d7
 ffff8806470b5e38 ffff880ab13b9c68 ffffffffa02197e0 ffff8806470a5740
 0000000000000000 ffff8806470a5760 ffffffffa01e7f50 ffffffffffffffff
Call Trace:
 [<ffffffff8117a0d7>] iterate_supers_type+0xa7/0xe0
 [<ffffffffa01e7f50>] ? zpl_prune_sb+0x30/0x30 [zfs]
 [<ffffffffa01e7f8f>] zpl_prune_sbs+0x3f/0x50 [zfs]
 [<ffffffffa01489b1>] arc_adjust_meta+0x121/0x1e0 [zfs]
 [<ffffffffa0148a70>] ? arc_adjust_meta+0x1e0/0x1e0 [zfs]
 [<ffffffffa0148a70>] ? arc_adjust_meta+0x1e0/0x1e0 [zfs]
 [<ffffffffa0148ada>] arc_adapt_thread+0x6a/0xd0 [zfs]
 [<ffffffffa00830b8>] thread_generic_wrapper+0x78/0x90 [spl]
 [<ffffffffa0083040>] ? __thread_create+0x310/0x310 [spl]
 [<ffffffff81089fbc>] kthread+0x8c/0xa0
 [<ffffffff81664034>] kernel_thread_helper+0x4/0x10
 [<ffffffff81089f30>] ? flush_kthread_worker+0xa0/0xa0
 [<ffffffff81664030>] ? gs_change+0x13/0x13

In case that means anything to anybody.

@dechamps
Contributor

Your call trace matches #861.

@behlendorf
Contributor

Right, this looks like a duplicate of #861.

@stephane-chazelas
Author

Well, it is different in that I don't get any "rcu_sched detected stall", the umount returns fine, and the export doesn't hang but returns with EBUSY; but indeed they look similar (and to #790).

Any recommendation on what I should try and do the next time it happens?

@behlendorf
Contributor

Once it happens there's nothing really that can be done. What needs to happen is for us to identify the exact flaw, see if/how it can be worked around, and then properly fix it.

behlendorf added a commit to behlendorf/zfs that referenced this issue Jul 16, 2013
The iterate_supers_type() function which was introduced in the
3.0 kernel was supposed to provide a safe way to call an arbitrary
function on all super blocks of a specific type.  Unfortunately,
because a list_head was used a bug was introduced which made it
possible for iterate_supers_type() to get stuck spinning on a
super block which was just deactivated.

The bug was fixed in the 3.3 kernel by converting the list_head
to an hlist_node.  However, to resolve the issue for existing
3.0 - 3.2 kernels we detect when a list_head is used.  Then to
prevent the spinning from occurring the .next pointer is set to
the fs_supers list_head which ensures the iterate_supers_type()
function will always terminate.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#1045
Issue openzfs#861
Issue openzfs#790
unya pushed a commit to unya/zfs that referenced this issue Dec 13, 2013
The iterate_supers_type() function which was introduced in the
3.0 kernel was supposed to provide a safe way to call an arbitrary
function on all super blocks of a specific type.  Unfortunately,
because a list_head was used a bug was introduced which made it
possible for iterate_supers_type() to get stuck spinning on a
super block which was just deactivated.

This can occur because when the list head is removed from the
fs_supers list it is reinitialized to point to itself.  If the
iterate_supers_type() function happened to be processing the
removed list_head it will get stuck spinning on that list_head.

The bug was fixed in the 3.3 kernel by converting the list_head
to an hlist_node.  However, to resolve the issue for existing
3.0 - 3.2 kernels we detect when a list_head is used.  Then to
prevent the spinning from occurring the .next pointer is set to
the fs_supers list_head which ensures the iterate_supers_type()
function will always terminate.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#1045
Closes openzfs#861
Closes openzfs#790