Failure to mount: "unable to fetch ZFS version for filesystem" after receiving a snapshot #7617
I'm seeing this lately on new systems. The sender is the 0.6.5 series, the receiver is days-old ZFS from Git. I suspect a receiver-side problem introduced in the last month or two. Only incremental sends are affected; full sends are clean. Going to try bisecting.
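For reference, bisecting a regression like this with saved reproducer streams generally looks like the following sketch; the known-good tag and the test script name are illustrative assumptions, not details from this thread:

```sh
# Bisect the ZFS repo between a known-good release and the failing Git HEAD.
cd zfs
git bisect start
git bisect bad HEAD          # current Git build fails the receive
git bisect good zfs-0.7.0    # hypothetical older tag that receives cleanly
# test-recv.sh (hypothetical) rebuilds and reloads the modules, replays the
# saved full and incremental send streams, and exits non-zero on failure.
git bisect run ./test-recv.sh
```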
047116a is the first bad commit
Normal output from zdb on filesystem object 1:
When using this version (or newer), no items are listed. There is not even an indication that it's supposed to be a ZAP. The sender is running a 0.6.5-series release.
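For context, object 1 of a ZPL filesystem is the master node ZAP, which is where the VERSION entry lives; dumping it with zdb looks roughly like this (the dataset name is a placeholder):

```sh
# Dump object 1 (the ZPL master node ZAP) of the received filesystem.
zdb -dddd backup/data 1
# A healthy master node lists ZAP entries such as VERSION, ROOT,
# DELETE_QUEUE, and SA_ATTRS. On the broken receive no entries are
# listed, so mount cannot fetch the filesystem version.
```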
Mentioning @tcaputi as the author of the bisected commit.
@DeHackEd Could you also turn on error printing with:
Then cause the problem again and provide the output from the debug log.
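The exact commands were lost from the comment; on ZFS on Linux, enabling error printing and collecting the debug log is presumably along these lines:

```sh
# Enable the internal debug log and SET_ERROR entries (bit 512 of zfs_flags).
echo 1   > /sys/module/zfs/parameters/zfs_dbgmsg_enable
echo 512 > /sys/module/zfs/parameters/zfs_flags
# Reproduce the failing receive, then collect the log:
cat /proc/spl/kstat/zfs/dbgmsg > /tmp/zfs-dbgmsg.txt
```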
Not right now; I can't spare the machine for experimenting with different ZFS versions while people are using it.
If you can cause the issue on the current version, that should be fine. Otherwise we can wait until you have time.
Oh yes, the current version does it. Unfortunately the (saved) send streams I've been using weigh in at 300 GB compressed for the initial snapshot and 3 GB compressed for the incremental. I don't have that kind of virtual machine lying around right now... Maybe tonight...
The problem is only in the incremental. You shouldn't need to receive the full stream again (you can just destroy the second snapshot). We can wait if you need to, though.
The issue is importing my known-working reproducer on a known-broken version of ZFS, on a machine that 1) I can run with the expectation that it could hang, need a reboot, or generally need ZFS reinstalled, and 2) has the capacity to load the test cases.
Need to expand the debug buffer size and try again with:
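The elided parameter is presumably the debug buffer cap; growing it would look like this (the 64 MiB value is an arbitrary illustration):

```sh
# Enlarge the in-kernel debug log (default 4 MiB) so the whole receive fits.
echo $((64 * 1024 * 1024)) > /sys/module/zfs/parameters/zfs_dbgmsg_maxsize
```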
Debug log: http://www.dehacked.net/zfs-noversion.zip (about 31 megabytes decompressed). It was a slow job due to the amount of metadata the incremental needed to load as it went.
I'll try to take a look tomorrow. One last question: do you have the large_dnode feature enabled on either side?
No. The sender is 0.6.5 and doesn't support it. I've tested a receiver with the feature explicitly disabled, and one with it enabled but never activated.
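For anyone reproducing this, the feature state on each pool can be checked like so (the pool name is a placeholder); "enabled" means available but never activated:

```sh
# Query the large_dnode feature on both pools.
zpool get feature@large_dnode tank
# NAME  PROPERTY              VALUE     SOURCE
# tank  feature@large_dnode   enabled   local
```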
I think I see what's going on. I'll have a PR up for you to test by the end of the day.
Actually, I think this should be a one-liner. Try applying this diff, and if it works I'll make a full PR out of it:
For background, the issue (if I'm correct) is that your send streams don't support large dnodes, so they send drr_dn_slots as 0, which the receive path doesn't handle.
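Going by the final commit message below, the eventual fix normalizes a zero slot count in receive_object(); a minimal sketch of that check (not the exact diff linked in this thread):

```c
/*
 * Streams generated without DMU_BACKUP_FEATURE_LARGE_DNODE carry
 * drr_dn_slots == 0 in their DRR_OBJECT records; treat that as the
 * minimum dnode size instead of using the raw value.
 */
uint32_t dn_slots = drro->drr_dn_slots != 0 ?
    drro->drr_dn_slots : DNODE_MIN_SLOTS;
```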
No luck with that diff.
@DeHackEd I was able to reproduce the issue locally. I missed another place where this check was needed. Please try this patch:
And still no love. After each failed job I ...
Thanks a lot for working with me on this. I don't have a good script to reproduce your new problem, but I think I see what may be causing the issue. Try this diff when you get a chance: https://pastebin.com/9NTxCirm
This one worked. The receive was successful, the version is set, object 1 looks intact, and I can mount the filesystem now.
Wonderful. I'll make a PR by the end of the day. Thanks for the help.
Currently, there is a bug where older send streams without the DMU_BACKUP_FEATURE_LARGE_DNODE flag are not handled correctly. The code in receive_object() fails to handle cases where drro->drr_dn_slots is set to 0, which is always the case when the sending code does not support this feature flag. This patch fixes the issue by ensuring that a value of 0 is treated as DNODE_MIN_SLOTS.

Fixes: openzfs#7617
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Currently, there is a bug where older send streams without the DMU_BACKUP_FEATURE_LARGE_DNODE flag are not handled correctly. The code in receive_object() fails to handle cases where drro->drr_dn_slots is set to 0, which is always the case when the sending code does not support this feature flag. This patch fixes the issue by ensuring that a value of 0 is treated as DNODE_MIN_SLOTS.

Tested-by: DHE <git@dehacked.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7617
Closes #7662
System information
I'm running ZFS on two systems: one live system and one for backup. The issue happened after receiving a snapshot from the live system. The live system is still fine. The pool was mounting fine before this, and now refuses to mount even in read-only mode:
I sent the backup over netcat inside the LAN. Each command exited without errors. The send/recv commands were the following:
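The commands themselves weren't preserved; a typical netcat-based transfer looks like this sketch (host, port, pool, and snapshot names are placeholders):

```sh
# On the backup host: listen and pipe the stream into zfs recv.
nc -l 8023 | zfs recv -Fu backup/data

# On the live host: send the incremental between two snapshots.
zfs send -i tank/data@snap1 tank/data@snap2 | nc backup-host 8023
```

Note that some netcat variants need `-p` before the listen port.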
Basic info about the backup pool:
I noticed the version info is missing, and zdb didn't output any version info for the pool:
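Checking the filesystem (ZPL) version property directly would look like this (the dataset name is a placeholder):

```sh
# On an affected dataset this query fails or shows no value, which
# matches the "unable to fetch ZFS version for filesystem" mount error.
zfs get version backup/data
```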
I'm not sure if it's related, but I had to tune some options (sync, compression, and xattr settings, and the ARC size) before I got the snapshot transfer working correctly. Write speeds were painfully slow at first with high I/O load, and the send/recv was interrupted a few times before I got it working... I also removed some old snapshots. Here's the history:
I also tried a rescue CD ISO with ZFS preinstalled, but everything was the same. I could just create a new backup pool and send everything over again, but nuking the old pool feels a bit scary.