Server died while replacing a disk, Error in Syslog #4771

Closed
maxigs opened this issue Jun 17, 2016 · 3 comments

Comments

@maxigs

maxigs commented Jun 17, 2016

Hi,

I just had a zpool die on me while I was adding/replacing a disk.

The resilver was going fine for a while after I replaced a disk (I had added the wrong one first and then swapped in the right one), but then the server became completely unresponsive and eventually had to be reset.

Here is the log output from that time, which suggests filing a bug report here:
https://gist.github.com/maxigs/2c34b682e519c1593e4047da6b73fe49#file-syslog

I have 2 zpools running on this system. The second one is unaffected, but the other is unrecoverable as far as I can see; I tried everything I could find.

$ sudo zpool import zpool2
=> cannot import 'zpool2': one or more devices is currently unavailable

$ sudo zpool import -F -n
=>

pool: zpool2
id: 17841620474451387077
state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
devices and try again.
see: http://zfsonlinux.org/msg/ZFS-8000-6X
config:

  zpool2                                   UNAVAIL  missing device
    ata-ST4000VN000-1H4168_Z30126KG-part2  ONLINE
    ata-ST4000VN000-1H4168_W300PHEH-part2  ONLINE
    ata-ST4000VN000-1H4168_W300QEG8-part2  ONLINE
    ata-ST4000VN000-1H4168_Z3012D6M-part2  ONLINE

Additional devices are known to be part of this pool, though their
exact configuration cannot be determined.
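
Beyond the plain import and the dry run above, would variants like the following even be worth trying with a missing device, or are they pointless here? (Only a sketch; I'm assuming the read-only and rewind import options apply to this situation.)

$ # rewind import, read-only so nothing gets written even if it comes up
$ sudo zpool import -o readonly=on -F zpool2
$ # force the import if the pool still looks in use by another system
$ sudo zpool import -f -F zpool2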

But both disks, old and new, are present in the system and seem to be fine (no SMART errors or anything). zdb still finds label info on the old disk:

$ sudo zdb -l /mnt/example.img
=> https://gist.github.com/maxigs/2c34b682e519c1593e4047da6b73fe49#file-zdb-old-disk

but nothing on the new one:

$ sudo zdb -l /dev/disk/by-id/usb-Seagate_Expansion_Desk_NA8ET4M9-0:0
=> https://gist.github.com/maxigs/2c34b682e519c1593e4047da6b73fe49#file-zdb-new-disk

$ sudo zpool history -li
=> https://gist.github.com/maxigs/2c34b682e519c1593e4047da6b73fe49#file-zpool-history
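
One more thing I was wondering (just a sketch; I'm assuming the -d option and the by-id paths are the right approach here): would pointing the import and zdb explicitly at the by-id device directory and the listed partitions make any difference?

$ # scan only the by-id directory for pool devices
$ sudo zpool import -d /dev/disk/by-id
$ # read the labels directly from one of the partitions listed above
$ sudo zdb -l /dev/disk/by-id/ata-ST4000VN000-1H4168_Z30126KG-part2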

Is there any chance to recover the pool?
I know it's no kind of RAID, but I was kind of expecting the highly praised ZFS to be a bit more resilient to such a failure, or at least to be able to recover or continue with the still-available parts of the pool when a disk fails.
There was no really critical data on it (the critical data is nicely backed up in the other zpool, with RAID and additional external snapshots), but it's still a ton of data to restore from backups.

@tuxoko
Contributor

tuxoko commented Jun 17, 2016

Same as #4752

@maxigs
Author

maxigs commented Jun 18, 2016

Thanks @tuxoko, the error message does indeed look similar. But in my case it needed even a bit more memory.

I have now tried setting spl_kmem_alloc_max to 12M, hoping this would be enough.
No idea if it was supposed to change anything.
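
For reference, this is roughly how I set it (assuming the value is in bytes and that the sysfs parameter plus a modprobe.d option is the right way to do it; 12M = 12582912):

$ # change the running value
$ echo 12582912 | sudo tee /sys/module/spl/parameters/spl_kmem_alloc_max
$ # make it persistent across reboots
$ echo "options spl spl_kmem_alloc_max=12582912" | sudo tee /etc/modprobe.d/spl.conf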

I assume my zpool is still unrecoverable?

@maxigs
Author

maxigs commented Jun 19, 2016

I'm closing this: I gave up on the old zpool, and the bug itself is already addressed.

Thanks guys :)

@maxigs maxigs closed this as completed Jun 19, 2016