zpool destroy fails on zpool with suspended IO #2878
Comments
Maybe reattach the original disk and try zpool clear or zpool online?
I agree with you, but I just want to remove an old disk, insert a new disk in the same slot, and recreate a zpool without a reboot.
Then try the -f flag on zpool export or destroy.
I removed the disk and inserted a new disk in the same slot, then tried your commands, but they show the messages below:

[root@fractal-C92E ~]# zpool status        <=== removed drive from the slot and added a new drive in the same slot
errors: No known data errors               ==> then wrote some data to the zpool, i.e. zp2
[root@fractal-C92E ~]# zpool clear zp2
[root@fractal-C92E ~]# zpool status
errors: 2 data errors, use '-v' for a list
[root@fractal-C92E ~]# zpool export -f zp2
[root@fractal-C92E ~]# zpool destroy zp2

Now the only option is to reboot to recreate the zpool. Let me know if I should try any further commands. Thanks.
[root@fractal-C92E ~]# zpool replace -f zp2 /dev/sdb /dev/disk/by-id/scsi-SATA_ST3750640NS_3QD1GN87
Put the old (original) disk back in the system so it is attached as the device which ZFS thinks is the member of the pool (before your last replace that was /dev/sdb), then retry the replace. The simple solution: reboot. One thing to keep in mind: replacing a non-redundant vdev of a pool needs to be done while the data on it is still available; otherwise the data can't be copied, as in your example where you pulled the drive.
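For concreteness, a minimal sketch of that sequence, reusing the device names already mentioned in this thread; it assumes the original disk (with its data intact) is physically reattached before the new disk is swapped in:

```sh
# Reattach the original disk, then clear the error state so the pool resumes
zpool online zp2 /dev/sdb      # or: zpool clear zp2
zpool status zp2               # pool should report ONLINE again

# With the new disk also connected, replace the old device while its data
# is still readable, so ZFS can copy the contents onto the new disk
zpool replace zp2 /dev/sdb /dev/disk/by-id/scsi-SATA_ST3750640NS_3QD1GN87
```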
Our use case is that we don't use ZFS RAID but a single drive per pool, since the replication factor is taken care of at the gluster volume level. How about an option to support a forced destroy (to destroy the pool even when its I/O is currently suspended) so that this corner case can be addressed?
We are looking for a graceful replacement of the drive without a reboot. Is there any workaround?
@kiranfractal If you plan to create redundancy only through the use of replication in gluster, then you should be able to reboot the box without any problems at any point in time. For replacing, see man zpool for usage and limitations. Note that your use case (a live pool with a non-redundant top-level vdev plus the disk backing that vdev being offline) is AFAIK currently not supported, apart from reconnecting the original disk (including the data on it) to continue. Destroying the redundancy of a pool below one good copy means restoring from backup, which by definition is offline time.

At the moment I can only suggest, for production, to either change your setup (create pools with redundancy, like ZFS is intended to be used if you care about your data being healthy and online on that specific machine) or use a simple throwaway POSIX filesystem as the storage backend for the gluster brick (which might nevertheless block if you remove the disk backing the filesystem prior to unmounting).

Only question left: are there plans for ZoL to support this?

(edit: accidental premature send)
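As an illustration of the redundant setup being recommended here (the pool name and device paths are placeholders):

```sh
# A mirrored top-level vdev can lose one member without suspending the pool,
# so a failed disk can be replaced online with zpool replace
zpool create gpool mirror \
    /dev/disk/by-id/ata-DISK_A_SERIAL \
    /dev/disk/by-id/ata-DISK_B_SERIAL
```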
There's definitely some work to be done here, and coincidentally enough just yesterday I opened a pull request, #2890, with a few bug fixes along these lines. Most of the infrastructure to force export/destroy a pool is already in place, but there are still some gotchas which need to be sorted out. For example, one major restriction imposed by the Linux VFS we'll have to contend with is that a filesystem cannot be unmounted if it has open file handles, and any mounted filesystem will hold references on the pool which will prevent us from destroying it. So if the administrator kills off all the processes with open file handles, then the filesystem should be unmountable even when the pool is suspended. What probably needs to happen is for someone to take a moment and work through all of these cases to see what works currently and what doesn't.
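A rough sketch of that manual cleanup, assuming a pool named zp2 whose dataset is mounted at /zp2 (both names are placeholders):

```sh
# Find processes holding files open under the dataset's mountpoint
lsof +D /zp2            # or: fuser -vm /zp2

# Stop them (fuser -k sends SIGKILL; treat it as a last resort)
fuser -km /zp2

# With no open handles left the filesystem should unmount, releasing its
# references on the pool so an export/destroy can be attempted
zfs unmount zp2
zpool export zp2
```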
I tried the -F option to export and it hangs, as shown below.

pool: mnt

dmesg output:
SPLError: 1176:0:(spl-err.c:67:vcmn_err()) WARNING: Pool 'mnt' has encountered an uncorrectable I/O failure and has been suspended.
sysdig_probe: driver loading
Thanks for the additional test case. In this case it's blocking while trying to update the history, which clearly won't work. This is a slight variation on the other issue. Let me ask an open-ended question: for a pool which is suspended, what would you expect the behavior to be for the following commands, where -f means force and -F means hard force?
Let me start with a disclaimer that I have not started to look at the inner workings of ZFS. However, we rely heavily on ZFS to provide on-disk data consistency without interruptions to data access (caused by, say, a reboot). I'm coming at this from the situation of recovering gracefully from an unrecoverable failure in the underlying drives. This could be either the situation where we have a non-redundant pool (say a pool per disk) or a redundant pool where the failure tolerances have been exceeded (say a raidz1 with a 2-disk failure). In these situations, I'd ideally like to recover without the need for a system reboot, which would cause other services to also be affected. I'm not so worried about the loss of data caused by the failed drive because I can handle that at a higher layer. That said, here are some tentative answers to the questions given above, specifically for a destroy:
When is the tentative date for the zfs-0.6.4 release?
Cleanly destroying or exporting a pool requires that the pool not be suspended. Therefore, set the POOL_CHECK_SUSPENDED flag for these ioctls so the utilities will output a descriptive error message rather than block. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2878
I've merged 87a63dd to master, which ensures a reasonable error is generated when attempting to destroy or export a suspended pool. This is far preferable to a hang. However, I'd still like to leave this issue open so we can explore what reasonable behavior for a force option would be.
87a63dd Prevent "zpool destroy|export" when suspended
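With that change in place, the expected behavior on a suspended pool looks roughly like this; the exact error wording is my assumption rather than a quote from the patch:

```sh
# The commands now fail fast with a descriptive error instead of blocking
zpool destroy zp2
# cannot destroy 'zp2': pool I/O is currently suspended
zpool export zp2
# cannot export 'zp2': pool I/O is currently suspended
```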
We are trying to use ZFS for the exact same scenario. Our clients have a pool of spinning rust they use for primary data storage. We zfs send that data to two places: an off-site backup, and a single local external USB drive. Basically they want fast (local) recovery options from the local USB drive in case the primary data storage somehow dies, and they want an off-site backup in case the office burns down. We have external Seagate USB drives at 24 locations and it seems like one dies every month. When one dies, we ship a new drive out, get it plugged in, and then have to reboot the box because we can't destroy 'backup-pool' while I/O is suspended. Once the box has been rebooted, 'backup-pool' no longer shows up, and we can create a new 'backup-pool' from the newly installed drive. I agree with @kiranfractal that there should be a '-F' option (in addition to '-f') that would force removal of the pool regardless of I/O being suspended.
Referencing https://www.illumos.org/issues/4128: "disks in zpools never go away when pulled".
I am facing this same issue in the following scenario:
----------> output of commands issued:
Surprisingly, issuing a scrub command correctly tries to scan the USB HDD, as I can see its LED blinking. Also, in this booted session of Ubuntu, if I fix the path back to /dev/sdc1 and /dev/sdc2 for the zpools by removing the flash drive and reattaching the zpool drive, the above does not change; the issue still remains. This only gets fixed if I reboot the system. Edit: in a different, freshly rebooted session, with /dev/sdc1,2
But:
Thereafter, after doing an export,
this time dmesg2.log. There is some bug; let me know if I can be of any other help. Hope this gets sorted.
@ashjas I'm not seeing anything wrong or unexpected there, and certainly nothing having to do with device names. The only thing that looks like a problem is
which typically indicates that that particular dataset is busy. Often that can be resolved with -f or by closing whatever files are open. lsof is your friend.
@ilovezfs This is not an issue? I'm unable to mount the volume. Secondly, -f doesn't make any difference at all!
@ashjas: I don't create pools like this. I don't know how ZFS handles that, but /dev/sda, /dev/sdb, etc. can change when devices are changed. I usually create pools using devices in /dev/disk/by-id/*. Those IDs are not supposed to change, even if devices are switched around.
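For example, a pool created against a stable by-id path instead of an sd* name (the ID below is a made-up placeholder):

```sh
# List stable identifiers, then use one when creating the pool;
# these paths survive device reordering across reboots and hotplug
ls -l /dev/disk/by-id/
zpool create backup-pool /dev/disk/by-id/ata-EXAMPLE_MODEL_SERIAL123
```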
@behlendorf Please reconsider the stuck zpool/zfs commands and the inability to remove a pool from the system not being a bug. This scenario isn't that uncommon with USB-disk based pools used for backups; 'reboot to fix it' should remain a speciality of Windows...
Agreed. USB pools for backups are not reliably usable given this issue.
@GregorKopka I agree. There's lots of room for improvement when someone has the time to focus on this. I've just removed the Bug tag from all issues because it wasn't actually helpful.
@behlendorf is there any hope for someone to implement this anytime soon?
No developers I know of are currently working on this.
A pool may only be resumed when the txg in the "best" uberblock found on-disk matches the in-core last synced txg. This is done to verify that the pool was not modified by another node or process while it was suspended; if this were to happen the result would be a corrupted pool.

Since a suspended pool may no longer always be resumable, it was necessary to extend the 'zpool export -F' command to allow a suspended pool to be exported. This was accomplished by leveraging the existing spa freeze functionality. During export, if '-F' is given and the pool is suspended, the pool will be frozen at the last synced txg and all in-core dirty data will be discarded. This allows the pool to be safely exported without having to reboot the system.

In order to test this functionality the broken 'ztest -E' option, which allows ztest to use an existing pool, was fixed. The code needed for this was copied over from zdb. ztest is used to modify the test pool from user space while the kernel has the pool imported and suspended.

This commit partially addresses issues openzfs#4003, openzfs#2023, openzfs#2878, and openzfs#3256 by allowing a suspended pool to be exported with 'zpool export -F'. There may still be cases where a reference on the pool, such as a filesystem which cannot be unmounted, will prevent the pool from being exported.

Add a basic zpool_tryimport function which can be used by zhack, zdb, and ztest to provide minimum pool import functionality; this way each utility isn't doing a slightly different thing.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
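If that patch is applied, the intended usage is roughly the following, going by the commit message above; note that any in-core dirty data written since the last synced txg is discarded (the pool name is a placeholder):

```sh
# Force-export a suspended pool without rebooting; unsynced dirty data is lost
zpool export -F backup-pool
```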
I just ran into the same problem after accidentally removing the only disk the pool consists of via a
Just for others that might run into the same problem:
Now, I'm at least able to export the pool correctly.
@lnxbil thanks, at least we have something. I have the same problem, using LUKS+ZFS, and when something goes wrong I have to restart the server; the problem comes when I'm not at home and I can't reboot. The pity is that when I use cryptsetup luksClose RAID1 it says the device is busy... :(
@behlendorf this issue was closed without any indication of who closed it or when - is that supposed to happen? There still isn't (AFAIK) a solution to this, only workarounds for a subset of the problem, so the defect remains. There is also quite a lot of information here that would certainly be interesting to someone who decides to work on this at some point in the future. Please reopen.
It's not clear to me exactly how this was closed. I don't have any objection to reopening it since this is still an issue.
This is still an issue in 2020. I had some flaky disks in a server and wanted to remove them. I'm somewhat sure I did zpool export or destroy before removing those 3 disks, but I must have done something wrong; at least, the pool was online and mounted while the disks were being ripped out (I didn't double-check beforehand). Now the pool is in a suspended state and apparently it's not possible to destroy/remove it from the system in this state. As there are VMs online on this system, I'm rather curious what to do without shutting them all down and rebooting. I'm getting this in dmesg after another "zpool destroy..." hung for a while:
[1510610.643638] INFO: task txg_sync:1714 blocked for more than 120 seconds.
Hi,
I am trying a disk replacement on a single-disk zpool; it results in suspended I/O and does not allow me to destroy the zpool.
Steps to reproduce:
1. Create a zpool with a single disk: zpool create zp1 /dev/sda
2. Remove the disk; zpool status shows the disk as unavailable.
3. Insert a new disk.
4. zpool replace zp1 /dev/sda /dev/sdb says: cannot replace /dev/sda with /dev/sdb: pool I/O is currently suspended
The above steps hold good for zpool destroy as well.
Now the zpool is unusable and the only option for me to destroy the zpool is to reboot the system, destroy the old pool, and create a new pool.
Is there any other option I can try so that I can replace the disk without rebooting the system?
Thanks,
Kiran.