ZFS suspends my pool on disk I/O failure #7118
Please don't open a new issue for something you already reported in #7097; close this one and ask around on IRC or zfs-discuss, or just wait for someone to respond with more insight into the problem on that bug.
@rincebrain Sorry about that, but this is not the same issue. I accidentally posted it before editing.
@morphinz Ah, yes, I see, I was confused why the subject didn't match the contents. This also sounds like it's not a bug, but something to ask about on IRC or zfs-discuss, unless you have good reason to think it's not related to disk I/O errors. I suppose the logs you share will answer that.
@morphinz Yes, can you please cherry-pick 51d1b58, or wait for 0.7.6, to determine if this is multihost related?
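For anyone following along, a rough sketch of how a single commit like that can be applied to a local ZFS-on-Linux source tree; the tag, URL, and build flags below are assumptions rather than steps given in this thread:

```sh
# Hypothetical build steps; adjust to your own ZoL 0.7.x build environment
# (which also needs a matching SPL build).
git clone https://github.com/zfsonlinux/zfs.git
cd zfs
git checkout zfs-0.7.5          # assumed base release tag
git cherry-pick 51d1b58         # the commit referenced above
sh autogen.sh
./configure                     # may need --with-spl=<path> on 0.7.x
make -s -j"$(nproc)"
```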
@behlendorf I have a pool of 226 disks in a two-node clustered architecture. Whenever one of them appears to fail, my pool is suspended and all the dependent services fail as well. As far as I understand, this can be worked around by setting one of the multihost tunables.
Yes, there are risks involved in setting that. If that's a risk you're OK with, then you can absolutely use this as a workaround until @ofaaland root-causes what's happening. But if I understand correctly, you're saying that a single disk failure can cause this; that shouldn't be the case. The MMP code is heartbeating all of those drives and only a single one needs to succeed. Do you know whether, when a drive fails, the others become unavailable in your configuration for some reason?
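The workaround under discussion is presumably one of the zfs_multihost_* module tunables; a minimal sketch of how such a tunable is typically changed at runtime, assuming zfs_multihost_fail_intervals is the parameter meant (per the module documentation, 0 means failed MMP writes no longer suspend the pool):

```sh
# Sketch only: accept the split-brain risk described above before using this.
echo 0 > /sys/module/zfs/parameters/zfs_multihost_fail_intervals

# Make it persistent across module reloads (path is the usual convention):
echo "options zfs zfs_multihost_fail_intervals=0" >> /etc/modprobe.d/zfs.conf
```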
Indeed, I've added it to the queue for 0.7.7. Sorry about that.
This risk seems OK to me, thanks for helping. I had no idea that whole disks had disappeared. My pool reports as healthy and there are no read/write/checksum errors, but I have not run a scrub since this issue. My pool is created with devices under /dev/mapper. However, I am fairly sure that a single disk disappeared and reappeared within a few seconds. This system has been up and running for more than 6 months and a pool suspend has never occurred before. I'd like to help with further investigation. This is a production site, but I can help with my test systems if it is possible to simulate the problem again.
I have experienced a similar issue to @morphinz. My zpool structure was:
Multihost was enabled:
After removing one of the disks and running high I/O, I got a suspended pool:
dmesg
So I used 51d1b58 to be sure that this is multihost related, repeated the test, and got:
Maybe setting zfs_multihost_fail_intervals to match the 30-second SCSI disk timeout would be a solution?
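For context, the suspension window is roughly zfs_multihost_fail_intervals multiplied by zfs_multihost_interval (in milliseconds); a hedged sketch of aligning it with a 30-second SCSI timeout, assuming the 0.7.x defaults:

```sh
# Default zfs_multihost_interval is 1000 ms, so 30 intervals is about 30 s.
cat /sys/module/zfs/parameters/zfs_multihost_interval           # expect 1000
echo 30 > /sys/module/zfs/parameters/zfs_multihost_fail_intervals
# The pool is suspended only after fail_intervals * interval ms pass without
# a single successful MMP write to any leaf vdev.
```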
The issue can also be reproduced without high I/O, with zfs_multihost_fail=1 and the disk-removal scenario. It seems that when one disk in a vdev is marked as REMOVED, FAULTED, or UNAVAIL, multihost causes the I/O suspension. I have a test environment that reproduces this 100% of the time and can also help with the investigation. Logs without high I/O:
@arturpzol it would be very helpful if you could rerun your test case with the multihost logging enabled.
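A sketch of what capturing that logging could look like; the kstat path and the zfs_multihost_history tunable below are the standard ZoL interfaces, shown here as assumptions rather than the exact commands being requested:

```sh
# Keep a longer MMP write history so the window around the suspend survives:
echo 200 > /sys/module/zfs/parameters/zfs_multihost_history

# After reproducing the problem, capture the history and the related events
# (replace <poolname> with the actual pool):
cat /proc/spl/kstat/zfs/<poolname>/multihost
zpool events -v > zpool-events.txt
```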
@behlendorf logs are below:
One of the outputs below is from when the suspend occurred:
I have tried testing 891b2e7 and abd17be, and with those the suspend is very hard to reproduce in my test, but still possible. Is there any chance of merging the commits to master in a short time?
@arturpzol I'm working on patches to make the multihost history more useful, and using those to investigate the issue with a removed disk triggering suspend via MMP. I'll get those pushed where you can get them; it will help clarify what's going on.

I see your test involves removing a physical disk. Are you able to reproduce just by offlining or detaching a disk via zpool offline or zpool detach (do this on your test environment, not your production system)? It seems from what you've described like there's a bug in the way MMP handles vdev changes that needs to be fixed, regardless of any changes to the suspend timeouts/import delay.

Also, there are concerns with those two patches; not that they don't work, but that the import time is unbounded. So it's not up to me, but I do not believe those patches are ready to be merged to master. I'm sorry this isn't going faster; we're doing the best we can.
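A minimal sketch of the software-only reproduction being asked about; the pool and device names are placeholders, and the I/O load generator is an arbitrary choice:

```sh
# Administratively fail a leaf vdev instead of pulling the physical disk:
zpool offline testpool sdX
dd if=/dev/zero of=/testpool/fs/loadfile bs=1M count=10000 &   # some write load
watch -n1 'zpool status testpool; dmesg | tail -n 5'
zpool online testpool sdX      # bring the device back once done

# For a mirror vdev, detaching is another option:
zpool detach testpool sdX
```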
The mmp write marked with asterisks is after the pool was suspended; the timestamp 1518211349 matches the time of the zfs.io_failure event.
At that point mmp_delay was over 5 seconds, when it had been 141ms prior to that. And the most recent mmp write was 5 seconds before that. So something caused the mmp thread to stop attempting to write for 5 seconds (multihost history shows each attempt whether it was successful or not). The other ZFS events all have the same time stamp, including the first one that says vdev_state = "REMOVED" or similar. So something happened 5 seconds before that which caused the problem; perhaps 5 seconds before is when you actually caused the device removal. @arturpzol do you have a way to know if that is the case? I'm guessing the process of detecting that the device was now gone and trying to gather data on it somehow holds up the mmp thread. I suspect that code has taken the config lock as writer and held it for 5 seconds. I'll look there next. |
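For readers following that analysis, a sketch of where those numbers come from; the paths are the standard ZoL kstat and event interfaces, assumed rather than quoted from this thread:

```sh
# MMP write history: one entry per attempted write, with a timestamp and the
# current mmp_delay. A ~5 s gap between consecutive entries is the stall
# described above.
cat /proc/spl/kstat/zfs/<poolname>/multihost

# The io_failure and vdev state-change events with matching timestamps:
zpool events -v | grep -B2 -A8 -E 'io_failure|statechange'
```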
@ofaaland Thank you for your notes. I have also triggered a pool suspend with zpool offline, and even with zpool online, but that depends on the I/O load.
Unfortunately I do not have the suspend logs from my previous note, so below are logs from zpool offline operations:
The kernel logs below include some code changes added to help with debugging:
@ofaaland Should I repeat the test with ZFS 0.7.7? Is there any chance of fixing the unexpected suspend on a vdev state change, or is the multihost tunable workaround the only solution for now?
@arturpzol Sorry, I do not expect the patches in 0.7.7 to have fixed your issue, unfortunately. I'm still working on it, but do not have a working fix yet.
Closing. I believe this was resolved in 0.7.7 by c30e716.
@behlendorf I will test again when I get some free time.
System information
Describe the problem you're observing
I have a pool with the multihost=on property.
Twice in three days ZFS has simply suspended my pool. I can't find any log explaining what went wrong; it just suspends without saying anything. This is the only log about it:
[Tue Jan 30 09:16:52 2018] WARNING: Pool 'clspool' has encountered an uncorrectable I/O failure and has been suspended.
On the first failure I saw a few disk I/O failures, but I didn't see any faults on the disks with "zpool status".
After that I rebooted the server and got my pool back. Everything was clear, I didn't see anything bad in dmesg, and I moved on.
But after 3 days, this morning, ZFS just suspended my pool again. I saw a disk I/O failure on the same disk, and zpool status reported 71 faults on that disk. But every vdev of mine has 10 disks spread across 5 different JBODs, and I use raidz2. Just one disk failure can't break my 226-disk pool. I also have 5 spares. Even when I lost a whole JBOD on this system I didn't have any problem... but one disk failure ruins it? What the heck is that?
I'm afraid ZFS can't handle a single disk failure right now, and I think the problem may be multihost=on.
On different pools, kernels, and operating systems, when I got a disk error I have seen bad outcomes like kernel freezes, but this is the first time I have seen a suspend, and the difference here is the multihost property.
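To help confirm or rule out the multihost theory, a hedged sketch of basic checks; the property and parameter names are the standard ones, and clspool is the pool name from the log above:

```sh
# Is multihost really active, and does each cluster node have a unique hostid?
zpool get multihost clspool
hostid

# Current MMP tuning (interval in ms, and how many failed intervals are
# tolerated before the pool is suspended):
cat /sys/module/zfs/parameters/zfs_multihost_interval
cat /sys/module/zfs/parameters/zfs_multihost_fail_intervals
```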
If you need anything, just ask me.
Include any warning/errors/backtraces from the system logs
Edit: "Adding logs"
My pool: https://paste.ubuntu.com/26511690/
And this is the suspend log:
Before the suspend I see these logs first. As you can see, I have a disk with I/O failures.
Dmesg: