Unable to receive encrypted raw send #12720
I'm actually not sure what to do here. I can't send a raw backup, full stop. And if I send it non-raw, I can't switch the incrementals to be raw because the IVs won't be the same. What am I meant to do in this situation? I will try running 2.1.0 in a VM or something on the target host to see if it makes any difference.
In theory, that message isn't supposed to happen unless you're sending from a newer version than the one receiving, and somehow it didn't notice a required feature was missing until later, at which point the "fix" is pretty obvious. In practice, that message comes out for all sorts of edge cases and could probably stand to be refined, though I'd personally just suggest changing it to say something like "Either you need to upgrade your receiver to a newer version or, if the versions aren't different (or the receiver is newer), you've hit an edge case we aren't handling; please report a bug."

Would you possibly be willing to share a full (i.e. not solely the resume portion, though that might be helpful in addition to the whole) raw (i.e. encrypted data blocks, though not some of the metadata) send stream with someone to debug? I'm probably not an ideal candidate, not being familiar with the over-the-wire format of send/recv or all the places encrypted recv touches, but it'd probably be very helpful if someone could repro it locally and iterate on whatever's going awry rather than having to round-trip to you for testing each time. (It's perfectly okay to say no if you're not comfortable; it would just be useful to know.)

I suspect it'll just say "lol it's encrypted, go pound sand", from recollection, but what do

You said the other system is on bullseye with 2.0.6 - is that from backports? Could you give us the full output of

Also, just for completeness, could you try receiving the stream locally somewhere on the sender to see if it errors the same way?

One final remark - while you're correct that you couldn't go from a non-raw to a raw send, you could still do a non-raw send and enable encryption on the receiver, like with any non-raw send being received. This of course doesn't work if you don't trust the receiver to have access to the encryption key, but that may or may not be an issue in your environment.
Thanks for your reply, @rincebrain. I'll try to address everything in order.

Sharing a full send is going to be hard. It's 1.3TB, and it's a dataset I'd prefer not to share directly anyway. But I realize it may only repro on this exact snapshot of this exact dataset. I don't mind the back-and-forth, though. Attached is the output of zdb for each of those objects, as requested:

The target system is on 2.0.6 via backports. Source zfs version:
Target zfs version:
I just tried sending the stream locally from source to a new dataset with the same error at the same position:
As a side note, the performance of zfs send here has been relatively poor. Before I switched to encryption I could zfs send at a rate of about 1-1.5 GiB/second. Unlike sequential scrubs, zfs send seems to be IOPS-bound; I'm not sure what makes the encrypted version so much slower. The pool topology on the source machine is 5x11 raidz2; the target is 3x12 raidz2. All are spinning disks between 12T and 18T in size.

Regarding the ability to still send a backup unencrypted: it is indeed the case that I want raw sends to work, since they'll be going to an untrusted offsite system.
Okay, I just figured it'd be worth asking. Though it'd also be interesting to see if you could create a clone with only the minimal data necessary to reproduce - possibly by doing a zfs clone of the snapshot, finding just the specific files that map to the object IDs I asked you for zdb data on, and nuking everything but them and their containing dirs... you could find the specific files with

I don't know of any performance bottlenecks specific to only raw send - I'd be curious to check out a flamegraph of send -Lec versus send -w on the same dataset over, say, sleep 120. It might be IO-bound, but I'd like to see if there are any visibly different call stacks that turn up... RAIDZs aren't particularly made of IOPS in general - what makes you think it's IOPS-bound and not CPU- or otherwise bound? (If it's "it's in iowait a bunch", that could mean many things.)

Yeah, I'm not surprised that unencrypted doesn't work for you; just thought I'd ask. :/
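Roughly the shape of that clone-and-prune approach, as an untested sketch with hypothetical dataset and snapshot names (in ZFS a file's object ID is also its inode number, so find can locate it):

```sh
zfs clone tank/media@snap tank/repro
obj=123456                                # placeholder: one object ID from the zdb output
find /tank/repro -xdev -inum "$obj"       # locate the file backing that object
# delete everything except those files and their parent directories, then:
zfs snapshot tank/repro@min
zfs send -w tank/repro@min > repro.zfs    # the raw stream should now be tiny
```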
Good idea on the clone strategy. Worked a charm, repros now with a 73M dataset. Here are my notes as I did this, just for reference:
Here is a Dropbox link to the zfs send:

You're right that I sort of assumed it was IOPS-bound simply because of the frequent iowait on
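For reference, the flamegraph comparison suggested above could be captured roughly like this, assuming perf and Brendan Gregg's FlameGraph scripts are on the PATH (dataset name is illustrative):

```sh
zfs send -w tank/media@snap > /dev/null &     # raw send running in the background
perf record -F 99 -a -g -- sleep 120          # sample all CPUs for two minutes
perf script | stackcollapse-perf.pl | flamegraph.pl > send-raw.svg
kill $!                                       # stop the background send
# repeat with "zfs send -Lec tank/media@snap" and compare the two SVGs
```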
To add a little more context, if it matters, here is this dataset's zfs send/recv "provenance". Originally it was unencrypted on a single host running normally. When I prepared the offsite, I wanted to take the opportunity to transfer the data back and forth to get encryption on both hosts and also rebalance data across vdevs. The unencrypted dataset was sent to the backup machine into an encrypted dataset, then sent back into the primary pool so it would be encrypted - I did not use a raw send here, though, and in order for raw sends to work on the offsite I needed to once again wipe the pool on the offsite and send the dataset again, this time with the raw flag.

It would have been much easier to have used a raw send, but there is a long-standing bug that directly affected my workflow here. Once I sent the dataset to the offsite, I needed to briefly wipe the source pool and send everything back, but I kept the source system online by mounting the offsite over NFS. This resulted in the dataset changing and needing to be sent back, which is affected by that bug. I confirmed it with a dry run before I started this journey and didn't want to take a risk. So I decided to do an extra transfer once everything was situated on the primary host, so the raw sends could start clean and never risk being modified by the offsite and sent back.

tl;dr:
The most recent zfs send to work fine was #3. The affected files are not new at all - old stuff.
Just to be clear, did you send just this dataset, or was it part of a hierarchy of datasets being sent around? You could always try cherry-picking/un-reverting d1d4769 and seeing if it helps your life.
Only this dataset in this case. Not a hierarchy.
I'm a little surprised -- is nobody working on raw send bugs? I can reliably reproduce this and have provided an easy repro case. I'd definitely like to dig in myself, but it's a lot to spin up. Maybe I hit some edge case, since I'm surprised nobody else seems to have run into this. As it stands, I'm just waiting until 2.1.1 lands in Debian (the packager finally uploaded it to experimental a week ago) to test further.
Not quite nobody, no. Just almost.

I've failed to convince anybody else so far to work on these, but I'm trying to knock them out one at a time, unfortunately slowly. I believe the root cause(s) for some of the bugs that behave like this one are understood; just no fix has been written and committed yet.

As far as how this situation arose: it seems the feature works well for the few big companies that use it, and there haven't been many people in the intersection of {willing to try fixing, uses native encryption} - even I don't meet those criteria; I don't use it normally.

I'm displeased by this, but I'm already doing all I can about it.
For what it's worth, this feature has been broken since it landed and has received no love from the dev team. Datto contributed it but doesn't wish to contribute to its maintenance. This is one reason I'm skeptical of all the new features (dRAID, object storage): they're running the BTRFS playbook, adding feature after feature without validating the existing functionality - all of that on top of a major reorganisation of code as platforms consolidate on OpenZFS. I honestly don't blame anyone who doesn't want to upgrade from 0.8.x.
I updated to 2.1.1 after it landed in Debian unstable. The send stream generated by 2.1.1 is identical to the one generated by 2.0.6, and the operation fails in the same way:
@putnam do you have a reproducer for this one?
Yes, please see this comment above: You can grab the ~70MB repro via the Dropbox link at the bottom. Try receiving that into a new dataset.
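For example, assuming the download is saved as repro.zfs (name is illustrative) and there is a scratch pool to receive into:

```sh
zfs receive testpool/raw-repro < repro.zfs    # the stream is raw, so no key is needed to receive it
```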
It's the above part of
Now there are a bunch of possibilities here; let's see which one is failing.
is failing, leading to the error. Let me see if I can figure out what is going on. The
These parts of
Inspecting the raw dataset from @putnam with
Which shows the problematic one: The
and since we know that the flag
Right now, I cannot see how

@putnam, if you would absolutely like to try to retrieve the data using raw receiving, you could try the following patch (with every precaution; using it, I was able to raw receive that dataset into a new pool):
So, we have a dnode:
with 1 slot, meaning it is 512 bytes, leaving a space of 448 bytes (512 - 64). It also has 1 blkptr (128 bytes) and, because it has SPILL_BLKPTR set, a spill pointer (128 bytes). This means the bonuslen should be 192, not the 288 stored in the zfs send stream. A bonuslen of 288 means 288 (bonus) + 64 + 128 (1 blkptr) = 480 bytes, leaving only 512 - 480 = 32 bytes for a spill pointer whose size is 128 bytes, which is impossible. I think this means that something probably went wrong with the setting/clearing of
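The same arithmetic, spelled out (all sizes in bytes, using the standard 512-byte dnode, 64-byte dnode core, and 128-byte block pointer):

```sh
dnode=512                                    # 1 slot
core=64                                      # fixed dnode header
blkptr=128                                   # dn_nblkptr = 1
spill=128                                    # required because SPILL_BLKPTR is set
echo $(( dnode - core - blkptr - spill ))    # 192 = the largest bonuslen that still fits
echo $(( dnode - core - blkptr - 288 ))      # with bonuslen 288, only 32 bytes remain for the spill pointer
```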
Thank you for digging into this!
The param is set to "1" for me. AFAIK this is the default(?)
…On Sat, Jan 22, 2022 at 3:04 PM George Amanakis wrote:

I just came across commit caf9dd2, which may be highly relevant. @putnam, what is your zfs parameter "zfs_send_unmodified_spill_blocks" set to?
Yes, that is the default, but it doesn't affect raw send... Do you have
The current ZTS has a test: I will probably submit a PR with that test modified to use raw sends, just in case something breaks in the future. However, the problem in the current issue is that somehow
To answer your questions on the dataset, here is
As for the file itself, uhh, not sure how I didn't see this sooner but I get an unusual result from ls:
Bad address? Then trying to get its extended attributes:
OK, so clearly the xattr is the issue:
The file itself is not corrupt. It's a video of a porcupine on my fence. It plays fine off a network share. I can also see its contents with xxd. Stat is OK:
As you can see, it hasn't been touched in a long time. It is plausible that an extended attribute got set via Samba or Finder, though.
There is a second file in the same folder with the same error:
So object 457084 would presumably also fail during the send, if the send made it that far.
I'm not sure here, but could this be related to #2700 somehow? The parent directory does have the setgid bit set (g+s); I use this all over the place, typically without any issue. Also, note that I was able to zfs send this dataset multiple times -- once from the original unencrypted pool to my offsite, which was encrypted, then again back to the original, also encrypted (but not a raw send).
Thanks for the input! That file has corrupt xattrs, which are stored in spill blocks when the zfs property "dnodesize=legacy" (the default) is set (I think the manual recommends setting this to auto, but it's not the default right now). The corruption here manifests as

I am not totally surprised this was not reported in a non-raw send: looking through the code, I didn't spot any assertions/checks validating the relationship between

Regarding #2700, it was resolved by 4254acb. Initially I thought the test case described in 4254acb was the cause of this.
@behlendorf would you mind having a look at this? This issue may manifest as silent corruption of xattrs when
The files are at least as old as their birth/mtime, and I've been using ZFS for longer. With 0.6.4 releasing in August 2015, it would make sense that these files hit the pool earlier than that. I have generally always used zfs send, not rsync, for migrations. Looking at my eternal bash history, I see these folders existing and being manipulated on ZFS in 2014. So yeah, the ZFS "provenance" goes back earlier than 0.6.4.

I kicked off a regular send out of curiosity and, as you suggested just now, it did not hang up on those files. Odd that this check is only in the raw flow, since the issue wouldn't only affect raw sends, right? It makes me wonder whether that code should be duplicated the way it is rather than using a shared check function. I'm not well-versed in the internal architecture, though, so I don't want to make dumb assumptions.

Scrub does not report any issues. I scrub monthly and act on any issues; zpool status is OK as well.
Since the xattrs are really not important on these files, and it sounds like this was a bug fixed way back in 0.6.4, I think simply copying the files to rewrite them may be the solution here. I used this to try to track down any straggler files, and it really is just these two:
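A sweep along these lines, with illustrative paths and getfattr from the attr package, is enough to flag files whose xattrs cannot be read:

```sh
find /tank/media -xdev -type f -print0 |
  while IFS= read -r -d '' f; do
    getfattr -d -m - "$f" > /dev/null 2>&1 || printf 'unreadable xattrs: %s\n' "$f"
  done
```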
No telling what event may have triggered the SA corruption, but it was probably something Finder or Samba did. Adjacent files had only a DOSATTRIB xattr that would have been written by Samba at some point or another, but it's decently long... So, I will go ahead and rewrite these files to wipe the xattr corruption away, unless you want me to keep this repro alive for a bit longer. It does sound useful to add those checks to scrub/non-raw send, though!

As a side note: what is the penalty for not setting dnodesize=auto? If it is left on legacy, is there some limitation? I don't anticipate needing to import this pool on something that can't handle auto, but I just want to better understand the utility. The man page hints that the benefit is performance-related but doesn't go into detail.
Enabling large dnodes essentially obviates the need for spill blocks. That means metadata can be read in fewer requests, instead of issuing additional read requests for spill blocks. I was thinking about adding an assert in
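For reference, switching a dataset over looks like this (dataset name is illustrative); it requires the large_dnode pool feature, and only files created after the change get large dnodes:

```sh
zfs set dnodesize=auto tank/media
zfs get dnodesize tank/media
```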
One issue: wouldn't these objects taint snapshots, too? So I will lose my snapshot history. Is there some way to extricate the corrupt files from snapshots, or any suggestion you may have that might preserve snapshots while fixing the issue?

Edit: One thought -- replace the files, snapshot, clone that snapshot to a new fs, and begin using the clone going forward, keeping the old fs read-only until the snaps are no longer needed. I think this avoids the duplication issue of making a new fs and rsync'ing into it. There is a possible advantage to migrating via rsync, though, in that setting dnodesize=auto would apply to all the new writes and might result in a performance improvement.
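That idea, sketched with illustrative names:

```sh
cp /tank/media/clip.mkv /tank/media/clip.mkv.new   # plain cp rewrites the data without the bad xattrs
mv /tank/media/clip.mkv.new /tank/media/clip.mkv
zfs snapshot tank/media@clean
zfs clone tank/media@clean tank/media-clean        # new fs to use going forward
zfs set readonly=on tank/media                     # keep the old fs and its snapshots for now
```

Note that the clone stays dependent on tank/media@clean, so destroying the old filesystem later would require promoting the clone first.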
Great work, @gamanakis! Thank you for looking into this so quickly and thoroughly!
In files created/modified before 4254acb there may be a corruption of xattrs which is not reported during scrub and normal send/receive. It manifests only as an error when raw sending/receiving. This happens because currently only the raw receive path checks for discrepancies between the dnode bonus length and the spill pointer flag. In case we encounter a dnode whose bonus length is greater than the predicted one, we should report an error. Modify in this regard dnode_sync() with an assertion at the end, dump_dnode() to error out, dsl_scan_recurse() to report errors during a scrub, and zstream to report a warning when dumping. Also added a test to verify spill blocks are sent correctly in a raw send. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #12720 Closes #13014
System information
Describe the problem you're observing
I am trying to transfer an encrypted dataset to another pool using a raw send. Partially through this send, the receive side fails with an unclear error message.
The receiving system is also on 2.0.6, running Debian Bullseye on kernel 5.10.0-9.
Here are the commands that reproduce the issue reliably:
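In outline (dataset, pool, and host names here are placeholders, not the originals):

```sh
zfs send -w tank/data@snap | ssh offsite zfs receive backup/data
```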
The dataset is about 1.3TB; at the 360GB mark, the receive reliably fails with the following error:
Oddly, if you provide the -s flag to the recv (enable resume support) the error changes to this:
I tried sending the stream through zstreamdump, which did not complain. I also maximized debug logging, generated a resume token and tried resuming (right at whatever transaction caused the recv to fail), which generated these messages (gist). It's noisy; see line 1786 for when receive_object failed.
I don't know the exact steps to reproduce this problem with another dataset. Originally this dataset was on another pool in an unencrypted state. It was sent into the encrypted pool just fine, then I wanted to raw-send the newly-encrypted dataset offsite to another pool. It is at this point that I ran into this show-stopper.
See also the gist below, which is the output of zstreamdump -vvv on the resumed send: https://gist.github.com/putnam/3a467fdab6fe96ca048264c1bcd8a4b0
UPDATE: In the discussion below, I was able to shrink the affected dataset down to a ~70MB repro. See this comment for an attachment.