kernel panic on v0.6.3-159_gc944be5 #2945
Comments
This was very likely fixed by #2884. If possible, could you try either.
Okay, I will try (1) and report back with the result.
Is it normal that after waiting 1 hour, less than 100GB of space had been freed in the parent pool? And one more strange thing: is it normal that the scrub speed in my config is approximately 10-12MB/sec on raidz2?
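For reference, scrub progress and the current scan rate can be checked from the shell; the pool name backup below is only a placeholder assumption:

```sh
# Show pool health along with scrub progress and the current scan rate
# (the pool name "backup" is an assumption, not taken from this thread).
zpool status -v backup

# Cancel a running scrub if it is too slow to be useful right now.
zpool scrub -s backup
```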
I have tried (1), using the latest packages from the zfs-testing repo (see versions in the issue description) WITH The kernel panic message is too long to paste here. Please see it in this gist: https://gist.github.com/umask/62b247f37107053ff791 This kernel panic occurred during the first rsync, after 6-7TB had been transferred.
I have found these messages in my logs:
Maybe it's somehow connected with the issue described in this ticket. See also #2752.
#2752 is resolved for me now.
For a clean experiment I erased the disks and controller settings and created the zpool from scratch. Now the initial rsync is running again. I hope that no kernel panic will occur.
Panic again during the initial rsync:
While copying the first 6-7TB of data everything was OK.
Probably these messages are somehow connected with the kernel panic that occurs afterwards. The kernel panic messages (too big to paste here) are in this gist: https://gist.github.com/umask/04651d14729c16331c6e (the problem is the same as in https://gist.github.com/umask/62b247f37107053ff791)
@behlendorf, could you give me some advice? Maybe I need to not use I need to store backups on this server, and if I cannot get ZoL working I will have to use ext3/4 with rdiff-backup (which is very slow...) :(
I'm trying to locate the file/directory that rsync is copying when the kernel panic occurs...
The panic occurs on different files (random files in one directory?).
I have erased the disks and controller settings and created the zpool from scratch... again. Now the initial rsync is running.
(zfs packages are still from the testing repo)
@umask Here are a few notes. The original panic posted in this issue, which had the following stack trace:
is most certainly caused by a corrupted dnode in which arbitrary data are used as a blkptr for a spill block. Second, this type of corruption can definitely lead to filesystems which can't be destroyed via Finally, the 4254acb patch should fix all cases of corrupted dnodes I'm aware of, but it will not do anything for a filesystem which has already been corrupted. In fact, no fix committed to master (other than maybe 5f6d0b6) will be of much assistance to an already-corrupted filesystem. If you are able to create a filesystem with a corrupted dnode using a module containing 4254acb, I'd sure like to know about it. And, if so, please try to track down one of the corrupted files and/or directories and post the output of
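The exact command requested above is not preserved in this extract; purely as an illustrative assumption, dumping the dnode of a suspect file with zdb usually looks something like the following (the dataset name and object number are placeholders):

```sh
# Get the object number of the suspect file; on ZFS the inode number
# reported by ls -i corresponds to the object id.
ls -i /backup/suspect-file

# Dump that object's dnode, including bonus buffer and spill details,
# at high verbosity. "backup/data" and "12345" are placeholder assumptions.
zdb -dddddd backup/data 12345
```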
@dweeezil, how can I determine the corrupted file(s) and/or directory(ies)? (I suspect one directory with 35GB of many subdirectories and files, and I hope the problem occurs there; otherwise I have to wait until ~7TB of data has been rsynced.)
If I get a kernel crash dump using kdump, will it be enough to identify the problem?
@umask The easiest way to identify the file with a bad dnode is to try to run a stat(2) on each file (typically by running a simple script). You can save time by trying directories first since they're most likely to become corrupted. Once you find a corrupted file and/or directory, start by running I'd like to clarify how you're running into this problem: are you creating a brand new ZFS filesystem with a module as of at least 4254acb (including all previous SA-related fixes) and then populating it with rsync and getting the corruption? Or is this an existing filesystem which was populated prior to all the SA fixes?
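A minimal sketch of the "stat every file" approach described above, assuming the dataset is mounted at /backup (the path is a placeholder):

```sh
#!/bin/sh
# Walk directories first (most likely to be corrupted), then regular
# files, calling stat(2) on each one. stat prints an error naming any
# path it cannot stat, which points at the damaged object.
find /backup -type d -print0 | xargs -0 stat > /dev/null
find /backup -type f -print0 | xargs -0 stat > /dev/null
```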
Here are the details of how my zpool was created and how I get the kernel panic:
i.e. the ZFS filesystem was created by the latest available version of ZoL with all available SA-related fixes. @dweeezil, what about Currently I'm running rsync with I have set up kdump. Will it help if a panic occurs?
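For background, enabling kdump on CentOS 6 is usually roughly the following; this is an assumption about the stock EL6 tooling, not a record of the exact steps used here:

```sh
# Install the crash-kernel tooling and enable the kdump service.
yum install kexec-tools
chkconfig kdump on

# Reserve memory for the capture kernel by adding crashkernel=auto
# (or e.g. crashkernel=128M) to the kernel line in /boot/grub/grub.conf,
# then reboot. After a panic, the vmcore is written under /var/crash/
# as configured in /etc/kdump.conf.
service kdump start
```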
@umask Neither of those sets of stack traces from the gists looks to be SA/spill related (but they're very long and I've not looked too thoroughly at them yet). What kind of preemption is your kernel using? Settings other than
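As an aside, one way to check which preemption model a kernel was built with (an assumed helper command, not something quoted from the thread):

```sh
# Print the preemption-related options from the running kernel's config;
# exactly one of CONFIG_PREEMPT_NONE, CONFIG_PREEMPT_VOLUNTARY or
# CONFIG_PREEMPT should be set to y.
grep PREEMPT /boot/config-$(uname -r)
```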
These kernel .config parameters are standard for vzkernel (the OpenVZ kernel for RHEL6/CentOS6) and for the kernel shipped by the RHEL6/CentOS6 vendors:
Both gists are from the two latest kernel panics. So, I'm waiting for my current test results. I will post the result here in this ticket.
So, as a result, rsync completed successfully
with Before this success I had tried
I have run rsync again on my dataset and no kernel panics occur anymore. I'm convinced that the problem was in Unfortunately, I have no possibility to reproduce the problem on this server because I need consistent backups. I'm setting up a new one for tests.
@umask If there is still a lingering issue with SA xattrs and the manner in which they're typically used when
Closing, all known SA fixes have been merged.
I'm running a server with ZoL to store backups on it.
Short details about my config:
I have run the stable version of ZoL, which is provided by these packages for CentOS 6:
No problems happened while I made the first backup using rsync. But after the initial backup, every time I run rsync again a kernel panic occurs.
I googled for the same problem (#2701) and decided to update the zfs-related packages using the zfs-testing repo.
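For context, switching to the zfs-testing packages on CentOS 6 typically looks roughly like this; the repo id and package names are assumptions based on the ZoL zfs-release packaging and may differ:

```sh
# Update the SPL/ZFS packages from the testing repository, then reboot
# so the new kernel modules are loaded.
yum --enablerepo=zfs-testing update spl spl-dkms zfs zfs-dkms
reboot
```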
My currently installed packages are:
After rebooting with the new versions of the zfs modules, the problem happened again.
Here is the backtrace for the kernel panic with 0.6.3 from the stable zfs repo:
Here is the backtrace for the kernel panic with the zfs modules from the testing repo:
As you may note, the kernel panic occurs about 2 hours after the server starts. After server start I run rsync manually.
Here are the details of my config:
0. My zfs partition uses dedup and gzip-7 compression. 2 SATA disks are used in mdraid raid1 for the swap/system/boot partitions with ext4. 8 SATA disks are attached to an LSI MegaRAID controller. For each of the 8 drives a raid0 volume was created:
The zpool was created on these 8 virtual raid0 drives.
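A minimal sketch of how such a pool might have been created, assuming a pool named backup, the 8 MegaRAID raid0 virtual drives appearing as /dev/sdb through /dev/sdi, and the raidz2 layout mentioned earlier in the thread:

```sh
# Create a raidz2 pool over the 8 single-disk raid0 virtual drives
# (device names and pool name are assumptions).
zpool create backup raidz2 sdb sdc sdd sde sdf sdg sdh sdi

# Enable deduplication and gzip-7 compression on the pool's root dataset.
zfs set dedup=on backup
zfs set compression=gzip-7 backup
```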
(I tried to run a scrub, but the speed was very slow and I stopped it; after the update from the zfs-testing repo I did not run zpool upgrade.) Additional details: