EBUSY upon ZPOOL_EXPORT #1045
Your call trace matches #861.

Right, this looks like a duplicate of #861.
Well, it is different in that I don't get any "rcu_sched detected stall": the umount returns fine, and the export doesn't hang but returns with EBUSY. But indeed they look similar (and similar to #790). Any recommendation on what I should try the next time it happens?
Once it happens there's nothing really which can be done. What needs to happen is for us to identify the exact flaw, see if/how it can be worked around, and then properly fix it.
The iterate_supers_type() function which was introduced in the 3.0 kernel was supposed to provide a safe way to call an arbitrary function on all super blocks of a specific type. Unfortunately, because a list_head was used, a bug was introduced which made it possible for iterate_supers_type() to get stuck spinning on a super block which was just deactivated.

This can occur because when the list head is removed from the fs_supers list it is reinitialized to point to itself. If the iterate_supers_type() function happened to be processing the removed list_head it will get stuck spinning on that list_head.

The bug was fixed in the 3.3 kernel by converting the list_head to an hlist_node. However, to resolve the issue for existing 3.0 - 3.2 kernels we detect when a list_head is used. Then, to prevent the spinning from occurring, the .next pointer is set to the fs_supers list_head, which ensures the iterate_supers_type() function will always terminate.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#1045
Closes openzfs#861
Closes openzfs#790
I got a second occurrence of the issue described at http://thread.gmane.org/gmane.linux.file-systems.zfs.user/4661
I've been doing an "offsite backup" every week, whereby I zfs-send|zfs-recv a number of datasets from one zpool onto another zpool on a pair of hard drives (well luks devices on top of hard drives). I do a zfs export, luksClose before taking the drives offsite.
Today, for some reason, the zfs export fails with:
There is no zfs command running, nothing mounted (zpool export managed to do that part) on there (checked /proc/mounts as well), nothing uses the zvols in there, no loop device or anything. I've tried to killall -STOP udevd in case it was somehow accessing stuff while the export was trying to tidy them away.
I've got a sysrq-t output, not sure what to look for to see what may be holding it.
Trying to "zfs mount -a" to see if I can mount it back, it says for every mount point:
filesystem 'offsite-backup-05/main/servers/skywalker/shadow_nbd/c' is already mounted
cannot mount 'offsite-backup-05/main/servers/skywalker/shadow_nbd/c': Resource temporarily unavailable
While "grep offsite-backup-05 /proc/mounts" returns nothing.
So there's something definitely going wrong there.
I can still read the zvols on there, though.
I have the zevents going to the console (zfs_zevent_console=1) and there has been nothing: no IO error, nothing at all. I used to get a lot of oopses, but since upgrading the memory to 48GB it has been quite stable until now.
Before rebooting, I also tried to export the other zpool (the one I was "zfs send"ing from) and got the same EBUSY error (successful umount but EBUSY upon the ioctl(ZPOOL_EXPORT), just as for the other one).
I noticed (in top) an arc_adapt thread taking 100% of one CPU. Running sysrq-l a few times showed it in the same place each time.

In case that means something to anybody.