Scrub stopping in the middle of run #4307
Comments
@Rovanion I've seen these traces several times, and each time they (or similar ones) could have been related to a stalling or failing hard drive, heavy workload, or high memory pressure. So I'd assume the drive is causing this, together with the kernel and ZFS not giving up on it, plus the load. Referencing:
I went ahead and rebooted the machine, after which the scrub continued as if nothing had ever happened. For some reason unattended-upgrades hasn't been doing its job on this machine, so I'm upgrading ZFS as I type this. Hopefully the issue won't rear its ugly head again. This issue can be closed as far as I'm concerned, unless a ZoL developer has any interest in investigating it further.
@Rovanion
Closing since this should be fixed by openzfs/spl@e843553, which is included in the latest stable release.
My latest scrub has been stuck in the same place for two days with no progress. It could be due to a drive breaking, but neither ZFS nor Linux has given up on the drive. The machine creates snapshots for each of its five filesystems every 15 minutes and sends them to another machine; there usually isn't any load to speak of on the pool.
I'm running ZFS 0.6.5.2-1~trusty on 64-bit Ubuntu 14.04.3 with kernel 3.13.0-76-generic. My storage layout is the following:
As of now there are a bunch of scheduled zfs operations, such as taking and destroying snapshots, progressively filling up the memory of the system.
Here are my arcstats and dmu_tx: http://paste.ubuntu.com/14877618/ http://paste.ubuntu.com/14877647/
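For anyone who wants to compare numbers: a minimal Python sketch of how the ARC size could be read out of the arcstats kstat text, assuming the usual layout of two header lines followed by `name type data` rows. The helper names and the sample values below are made up for illustration and are not taken from the pastes above.

```python
def parse_arcstats(text):
    """Parse arcstats kstat text into a {name: int} dict."""
    stats = {}
    # The first two lines are the kstat header and the column titles.
    for line in text.splitlines()[2:]:
        parts = line.split()
        if len(parts) == 3 and parts[2].lstrip("-").isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def arc_fill_ratio(stats):
    """Fraction of the maximum ARC target size (c_max) currently in use."""
    return stats["size"] / stats["c_max"]


# Hypothetical sample, shaped like /proc/spl/kstat/zfs/arcstats.
SAMPLE = """\
6 1 0x01 91 4368 1000000000 2000000000
name                            type data
size                            4    2147483648
c_max                           4    4294967296
"""

stats = parse_arcstats(SAMPLE)
print(f"ARC at {arc_fill_ratio(stats):.0%} of c_max")
```

On a live system the text would come from reading `/proc/spl/kstat/zfs/arcstats` instead of `SAMPLE`.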
Here is my dmesg containing ata errors and zfs stack traces from hung threads: http://paste.ubuntu.com/14877460/
Running `iostat -dmx 1` doesn't show any drive getting a ton of I/O; it's mostly zeroes everywhere, with occasional small traffic to some devices. The 2T pool has 600GB of storage unallocated, and the machine has 8GB of RAM, about half of which is in use as I write this.
Is there any further information I can provide to help figure out why this is happening? Is it at all related to issues #3947 and #3867? If the device were actually dead I'd assume that ZFS would mark it as such, but the machine seems to be stuck in limbo.
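To put a number on "stuck in the same place", here is a rough Python sketch that compares the "% done" figure from two `zpool status` captures taken some time apart. It assumes the scan section contains a line ending in something like `72.94% done`; the function names and the sample text are hypothetical, not output from this machine.

```python
import re

# Matches the completion figure in the scan section, e.g. "72.94% done".
PCT_DONE = re.compile(r"([\d.]+)% done")


def scrub_percent(status_text):
    """Return the scrub completion percentage, or None if none is reported."""
    match = PCT_DONE.search(status_text)
    return float(match.group(1)) if match else None


def is_stalled(earlier, later):
    """True if both captures report progress and it has not advanced."""
    before, after = scrub_percent(earlier), scrub_percent(later)
    return before is not None and after is not None and after <= before


# Hypothetical capture, for illustration only.
SAMPLE = """\
  scan: scrub in progress since Sun Feb  7 00:24:01 2016
    1.02T scanned out of 1.40T at 1/s, (scan is slow, no estimated time)
    0 repaired, 72.94% done
"""

print(is_stalled(SAMPLE, SAMPLE))
```

Feeding it two captures taken, say, an hour apart would show whether the scrub is merely slow or has stopped advancing altogether.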