Encryption and Raw Send Stability Improvements #6864
Conversation
cmd/zpool/zpool_main.c
Outdated
case ZPOOL_ERRATA_ZOL_6845_ENCRYPTION:
	(void) printf(gettext("action: To correct the issue "
	    "revert to an earlier version and backup\n\tand "
	    "destroy any existing encrypted datasets before "
I think "existing" is redundant (same in show_import()).
Ditto, see show_import comment.
cmd/zpool/zpool_main.c
Outdated
"corrected. Revert to an eariler" | ||
"version and\n\tbackup and destroy any " | ||
"existing encrypted datasets\n\tbefore " | ||
"updating.\n")); |
This doesn't seem consistent with the PR description - can these old encrypted datasets be mounted read-only? If so, then it shouldn't be necessary to revert to an earlier version.
This message needs to be updated; version zero encrypted datasets can be mounted read-only. How about something like this, which is in line with the information posted about the errata:
action: The pool can be imported but due to an on-disk incompatibility existing
	encrypted datasets can only be mounted read-only. After import affected
	datasets can be converted to the new version by using 'zfs send' and
	'zfs receive'.
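For illustration, the conversion the suggested wording points at might look roughly like this; the dataset names are hypothetical, and if the copy should stay encrypted it should be received beneath a parent that was encrypted with the fixed code:

# Hedged sketch of the migration: 'tank/olddata' is an affected (key
# version 0) dataset. It remains readable after import, so its contents
# can be copied with a normal (non-raw) send into a dataset written by
# the fixed code.
zfs snapshot tank/olddata@migrate
zfs send tank/olddata@migrate | zfs receive tank/newdata

# After verifying the copy, retire the old dataset and rename the new
# one into its place.
zfs destroy -r tank/olddata
zfs rename tank/newdata tank/olddata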
include/sys/zfs_ioctl.h
Outdated
@@ -219,7 +219,8 @@ typedef struct dmu_replay_record {
	uint8_t drr_flags;
	uint32_t drr_raw_bonuslen;
	uint64_t drr_toguid;
-	/* only nonzero for raw streams */
+	/* only (possibly) nonzero for raw streams */
+	uint64_t drr_maxblkid;
we're OK with moving the location of the following fields because we're breaking send/recv compatibility anyway? How is the send stream incompatibility detected on the receiving side? E.g. new send stream feature flag?
@tcaputi can you comment on this? Since this send stream format was never part of any tagged release, we have some flexibility here about changing it.
cmd/zpool/zpool_main.c
Outdated
case ZPOOL_ERRATA_ZOL_6845_ENCRYPTION:
	(void) printf(gettext(" action: Existing "
	    "encrypted datasets contain an on-disk "
	    "incompaitibility which\n\tneeds to be "
s/incompaitibility/incompatibility,/
module/zfs/zio_crypt.c
Outdated
ASSERT3U(version, ==, ZIO_CRYPT_KEY_CURRENT_VERSION);

/*
 * The hole_birth feature might set these fields even if this is bp
s/this is bp/this bp/
@tcaputi please fix.
include/sys/dmu_objset.h
Outdated
@@ -217,6 +217,7 @@ boolean_t dmu_objset_userobjused_enabled(objset_t *os);
boolean_t dmu_objset_userobjspace_upgradable(objset_t *os);
void dmu_objset_userobjspace_upgrade(objset_t *os);
boolean_t dmu_objset_userobjspace_present(objset_t *os);
+boolean_t dmu_objset_bad_encryption_version(objset_t *os);
Rather than dmu_objset_bad_encryption_version(), which I admit is descriptive, let's make the generic interface return the version number, even if we don't expect to have to bump this version again. A convenience wrapper function can be added to check whether that version is compatible and include an explanation why. Calling it "bad" is misleading.
module/zfs/dmu.c
Outdated
	dnode_new_blkid(dn, maxblkid, tx, B_FALSE);
	rw_exit(&dn->dn_struct_rwlock);
	dnode_rele(dn, FTAG);
	return (err);
nit: functionally it doesn't matter, but it'd be nice to explicitly return (0) here; the other newly added crypto functions written in the same style arguably should do the same as well.
I copied this from dmu_object_set_nlevels() and dmu_object_set_blocksize(). Do you want me to change these too?
Yes, please do. It looks like you got a little unlucky and happened to pick a non-typical example function. The code isn't as consistent as one might like about this.
module/zfs/dmu_objset.c
Outdated
@@ -663,6 +663,9 @@ dmu_objset_own_impl(dsl_dataset_t *ds, dmu_objset_type_t type,
		return (SET_ERROR(EINVAL));
	} else if (!readonly && dsl_dataset_is_snapshot(ds)) {
		return (SET_ERROR(EROFS));
+	} else if (!readonly && decrypt &&
+	    dsl_dir_bad_encryption_version(ds->ds_dir)) {
nit: what do you think about using the dmu_objset_* version of the function here, since we've just gotten the objset_t with dmu_objset_from_ds()?
Not sure how I feel about that. It's easy enough to add a wrapper, but the same could be said for many other things, such as the call to dsl_dataset_is_snapshot() 3 lines above.
Sure, I don't feel strongly about this; I'm happy with whatever you prefer.
module/zfs/dsl_crypt.c
Outdated
@@ -1952,6 +1970,12 @@ dsl_crypto_recv_key_check(void *arg, dmu_tx_t *tx)
		goto error;
	}

+	ret = nvlist_lookup_uint64(nvl, DSL_CRYPTO_KEY_VERSION, &version);
+	if (ret != 0 || version != ZIO_CRYPT_KEY_CURRENT_VERSION) {
+		/* don't receive old on-disk formats */
Please include an explanation of why this old format can't be received. Referencing the errata would be a reasonable way to handle this.
module/zfs/dsl_crypt.c
Outdated
@@ -2254,6 +2291,14 @@ dsl_crypto_populate_key_nvlist(dsl_dataset_t *ds, nvlist_t **nvl_out)
	if (ret != 0)
		goto error;

+	ret = zap_lookup(mos, dckobj, DSL_CRYPTO_KEY_VERSION, 8, 1, &version);
+	if (ret != 0 || version != ZIO_CRYPT_KEY_CURRENT_VERSION) {
+		/* We don't support raw sends of legacy on-disk formats */
Can you expand on why this isn't supported? I'd also pull the block comment up above the zap_lookup() call.
tests/runfiles/linux.run
Outdated
@@ -644,7 +644,8 @@ tests = ['rsend_001_pos', 'rsend_002_pos', 'rsend_003_pos', 'rsend_004_pos',
    'send-c_lz4_disabled', 'send-c_recv_lz4_disabled',
    'send-c_mixed_compression', 'send-c_stream_size_estimate', 'send-cD',
    'send-c_embedded_blocks', 'send-c_resume', 'send-cpL_varied_recsize',
-   'send-c_recv_dedup', 'send_encrypted_heirarchy', 'send_freeobjects']
+   'send-c_recv_dedup', 'send_encrypted_files', 'send_encrypted_heirarchy',
+   'send_freeobjects']
A test case needs to be added to check for this errata. You can create a tiny encrypted version 0 pool and include it with the ZTS for testing purposes. The test case should do the following (a rough sketch follows below):
- Verify version=0 filesystems can be mounted read-only but not read-write.
- Verify version=0 volumes can be opened read-only but not read-write.
- Verify the recommended zfs send | zfs recv conversion process works as intended.
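For illustration only, the checks might be sketched in ZTS style along these lines; the pool, dataset, and volume names are hypothetical, and key loading is assumed to have already been handled by the test setup:

. $STF_SUITE/include/libtest.shlib

# The pre-built version=0 pool image is assumed to be imported as 'testpool'
# with 'testpool/v0fs' (filesystem) and 'testpool/v0vol' (volume).
log_mustnot zfs mount testpool/v0fs        # read-write mount must fail
log_must zfs mount -o ro testpool/v0fs     # read-only mount must succeed
log_mustnot dd if=/dev/zero of=/dev/zvol/testpool/v0vol bs=512 count=1
log_must dd if=/dev/zvol/testpool/v0vol of=/dev/null bs=512 count=1

# Recommended conversion path: send the old dataset into a new one, then
# verify that the new dataset can be mounted and written to.
log_must zfs snapshot testpool/v0fs@migrate
log_must eval "zfs send testpool/v0fs@migrate | zfs receive testpool/newfs"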
I was thinking about this. Would I just put the pool in the zpool_import test dir?
Yes, a similar thing is already done in the zpool import tests.
@behlendorf: GitHub won't let me comment on some of your notes...
What exactly is wrong here?
The short version is that we needed to be sending the maxblkid before, but we weren't. In testing this didn't matter, since we just created new files with new data and sent them, so the maxblkid was always the same on both sides. The problem comes when you create a large file, truncate it, and then send it.
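For illustration, the failing case described here might be reproduced along these lines; the names are hypothetical and 'tank/enc' is assumed to be an encrypted dataset on the sending side:

# Create a large file, shrink it, then raw-send both snapshots. Without
# dn_maxblkid in the stream, the receive side never learns that the file's
# highest block id went down after the truncate.
dd if=/dev/urandom of=/tank/enc/bigfile bs=1M count=64
zfs snapshot tank/enc@before
truncate -s 1M /tank/enc/bigfile
zfs snapshot tank/enc@after
zfs send -w tank/enc@before | zfs receive tank/copy
zfs send -w -i @before tank/enc@after | zfs receive tank/copy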
Codecov Report
@@ Coverage Diff @@
## master #6864 +/- ##
=========================================
+ Coverage 75.26% 75.37% +0.1%
=========================================
Files 296 296
Lines 95453 95332 -121
=========================================
+ Hits 71845 71853 +8
+ Misses 23608 23479 -129
Continue to review full report at Codecov.
So will I be able to …
@qinq-net …
I'm going to be testing a fresh pool from this patchset soonish.
@prometheanfire We are aware of one more issue with raw sends which we are working on (hence why this hasn't been merged yet). This should manifest as the …
ya, I'm aware that ymmv and all that, I can survive for a little bit without a send (or recover via other methods).
@@ -187,6 +187,12 @@
	(MIN(zfs_key_max_salt_uses, ZFS_KEY_MAX_SALT_USES_DEFAULT))
unsigned long zfs_key_max_salt_uses = ZFS_KEY_MAX_SALT_USES_DEFAULT;

+typedef struct blkptr_auth_buf {
+	uint64_t bab_prop;			/* blk_prop - portable mask */
+	uint8_t bab_mac[ZIO_DATA_MAC_LEN];	/* MAC from blk_cksum */
This line had a whitespace error (CRLF) ^M
The system I'm building and testing on may have problems booting since I'm testing with passphrase encryption. The only way I know of to load the key wrapped by a passphrase will also mount all the datasets, which is generally not desired in early boot. Is there a way to load the key via passphrase without mounting the datasets? Edit: looks like I can …
There is also …
ya, figured that out. At the moment I have to manually load the key in dracut and recover to move on, but I do have a system booted at least.
Then I'm good. As far as actually using the patch, working so far...
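For reference, a minimal sketch (with a hypothetical dataset name, and assuming keylocation=prompt) of loading a passphrase-wrapped key separately from mounting, which is the split that helps in early boot:

# Load the key only; nothing is mounted at this point.
zfs load-key rpool/ROOT                  # prompts for the passphrase
# or, non-interactively:
echo "$PASSPHRASE" | zfs load-key rpool/ROOT

# Mounting happens separately, later in boot.
zfs mount rpool/ROOT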
Not sure if it's related, but on shutdown I get the following. I only have a picture, no raw text unfortunately.
Well, I couldn't try raw sending yet since I only have my home server using this patch as of now, but I'll apply it to my notebook on the weekend.
When performing zil_claim() at pool import time, it is important that encrypted datasets set os_next_write_raw before writing to the zil_header_t. This prevents the code from attempting to re-authenticate the objset_phys_t when it writes it out, which is unnecessary because the zil_header_t is not protected by either objset MAC and impossible since the keys aren't loaded yet. Unfortunately, one of the code paths did not set this flag, which causes failed ASSERTs during 'zpool import -F'. This patch corrects this issue. Reviewed-by: Jorgen Lundman <lundman@lundman.net> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Tom Caputi <tcaputi@datto.com> Closes #6864 Closes #6916
Currently, when a raw zfs send file includes a DRR_OBJECT record that would decrease the number of levels of an existing object, the object is reallocated with dmu_object_reclaim() which creates the new dnode using the old object's nlevels. For non-raw sends this doesn't really matter, but raw sends require that nlevels on the receive side match that of the send side so that the checksum-of-MAC tree can be properly maintained. This patch corrects the issue by freeing the object completely before allocating it again in this case. This patch also corrects several issues with dnode_hold_impl() and related functions that prevented dnodes (particularly multi-slot dnodes) from being reallocated properly due to the fact that existing dnodes were not being fully cleaned up when they were freed. This patch adds a test to make sure that zfs recv functions properly with incremental streams containing dnodes of different sizes. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Jorgen Lundman <lundman@lundman.net> Signed-off-by: Tom Caputi <tcaputi@datto.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #6821 Closes #6864
Currently, os_next_write_raw is a single boolean used for determining whether or not the next call to dmu_objset_sync() should write out the objset_phys_t as a raw buffer. Since the boolean is not associated with a txg, the work simply happens during the next txg, which is not necessarily the correct one. In the current implementation this issue was misdiagnosed, resulting in a small hack in dmu_objset_sync() which seemed to resolve the problem. This patch changes os_next_write_raw to be an array of booleans, one for each txg in TXG_OFF and removes the hack. Reviewed-by: Jorgen Lundman <lundman@lundman.net> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Tom Caputi <tcaputi@datto.com> Closes #6864
Thank you for everyone's help on this! This PR is ready to go and has been merged.
I noticed this patch was merged and have applied it to my (testing) production system. I have smoothly upgraded my datasets. After upgrading I tried raw sends. A full send looks fine and no longer suffers from the kernel oops, but incremental raw sends sometimes still fail to be mounted for … By the way, I sent those datasets using something like …
Hmmmm. Do you have any steps to reproduce this? Or sample send files that cause the issue (that you don't mind sharing with me)? We had done fairly extensive testing for these kinds of issues before merging, but it's definitely possible that another case slipped by. Can you also confirm that BOTH the send and receive side are actually running the new code? You can do this by running 'cat /sys/module/zfs/version' and checking that the code at the end matches the commit hash.
@tcaputi …
@lnicola I commented on each of the individual issues.
@tcaputi Unfortunately the …
@qinq-net If you could give me some steps to reproduce the problem and confirm that you are running this patch on at least the receive side, that would be very helpful. We should probably make a new issue for this once you have that since this is a closed PR. I'll see what I can do on Monday.
The on-disk format for encrypted datasets protects not only the encrypted and authenticated blocks themselves, but also the order and interpretation of these blocks. In order to make this work while maintaining the ability to do raw sends, the indirect bps maintain a secure checksum of all the MACs in the block below it along with a few other fields that determine how the data is interpreted. Unfortunately, the current on-disk format erroneously includes some fields which are not portable and thus cannot support raw sends. It is not possible to easily work around this issue due to a separate and much smaller bug which causes indirect blocks for encrypted dnodes to not be compressed, which conflicts with the previous bug. In addition, the current code generates incompatible on-disk formats on big endian and little endian systems due to an issue with how block pointers are authenticated. Finally, raw send streams do not currently include dn_maxblkid when sending both the metadnode and normal dnodes which are needed in order to ensure that we are correctly maintaining the portable objset MAC. This patch zero's out the offending fields when computing the bp MAC and ensures that these MACs are always calculated in little endian order (regardless of the host system's byte order). This patch also registers an errata for the old on-disk format, which we detect by adding a "version" field to newly created DSL Crypto Keys. We allow datasets without a version (version 0) to only be mounted for read so that they can easily be migrated. We also now include dn_maxblkid in raw send streams to ensure the MAC can be maintained correctly. This patch also contains minor bug fixes and cleanups. Reviewed-by: Jorgen Lundman <lundman@lundman.net> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Tom Caputi <tcaputi@datto.com> Closes openzfs#6845 Closes openzfs#6864 Closes openzfs#7052
@qinq-net I think I have a fix for your problems. I will be making a PR soon. |
This is a port of 047116a - Raw sends must be able to decrease nlevels, to the zfs-0.7-stable branch. It includes the various fixes to the problem of receiving incremental streams which include reallocated dnodes in which the number of dnode slots has changed but excludes the parts which are related to raw streams. From 047116a: Currently, when a raw zfs send file includes a DRR_OBJECT record that would decrease the number of levels of an existing object, the object is reallocated with dmu_object_reclaim() which creates the new dnode using the old object's nlevels. For non-raw sends this doesn't really matter, but raw sends require that nlevels on the receive side match that of the send side so that the checksum-of-MAC tree can be properly maintained. This patch corrects the issue by freeing the object completely before allocating it again in this case. This patch also corrects several issues with dnode_hold_impl() and related functions that prevented dnodes (particularly multi-slot dnodes) from being reallocated properly due to the fact that existing dnodes were not being fully cleaned up when they were freed. This patch adds a test to make sure that zfs recv functions properly with incremental streams containing dnodes of different sizes. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Jorgen Lundman <lundman@lundman.net> Signed-off-by: Tom Caputi <tcaputi@datto.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tim Chase <tim@chase2k.com> Closes openzfs#6821 Closes openzfs#6864 Should close openzfs#6366
This is a port of 047116a - Raw sends must be able to decrease nlevels, to the zfs-0.7-stable branch. It includes the various fixes to the problem of receiving incremental streams which include reallocated dnodes in which the number of dnode slots has changed but excludes the parts which are related to raw streams. From 047116a: Currently, when a raw zfs send file includes a DRR_OBJECT record that would decrease the number of levels of an existing object, the object is reallocated with dmu_object_reclaim() which creates the new dnode using the old object's nlevels. For non-raw sends this doesn't really matter, but raw sends require that nlevels on the receive side match that of the send side so that the checksum-of-MAC tree can be properly maintained. This patch corrects the issue by freeing the object completely before allocating it again in this case. This patch also corrects several issues with dnode_hold_impl() and related functions that prevented dnodes (particularly multi-slot dnodes) from being reallocated properly due to the fact that existing dnodes were not being fully cleaned up when they were freed. This patch adds a test to make sure that zfs recv functions properly with incremental streams containing dnodes of different sizes. Authored-by: Tom Caputi <tcaputi@datto.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Jorgen Lundman <lundman@lundman.net> Signed-off-by: Tom Caputi <tcaputi@datto.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tim Chase <tim@chase2k.com> Ported-by: Tim Chase <tim@chase2k.com> Closes openzfs#6821 Closes openzfs#6864 NOTE: This is the first of the port of 3 related patches patches to the zfs-0.7-release branch of ZoL. The other two patches should immediately follow this one.
This is a port of 047116a - Raw sends must be able to decrease nlevels, to the zfs-0.7-stable branch. It includes the various fixes to the problem of receiving incremental streams which include reallocated dnodes in which the number of dnode slots has changed but excludes the parts which are related to raw streams. From 047116a: Currently, when a raw zfs send file includes a DRR_OBJECT record that would decrease the number of levels of an existing object, the object is reallocated with dmu_object_reclaim() which creates the new dnode using the old object's nlevels. For non-raw sends this doesn't really matter, but raw sends require that nlevels on the receive side match that of the send side so that the checksum-of-MAC tree can be properly maintained. This patch corrects the issue by freeing the object completely before allocating it again in this case. This patch also corrects several issues with dnode_hold_impl() and related functions that prevented dnodes (particularly multi-slot dnodes) from being reallocated properly due to the fact that existing dnodes were not being fully cleaned up when they were freed. This patch adds a test to make sure that zfs recv functions properly with incremental streams containing dnodes of different sizes. This also includes a one-liner fix from loli10K to fix a test failure: openzfs#7792 (comment) Authored-by: Tom Caputi <tcaputi@datto.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Jorgen Lundman <lundman@lundman.net> Signed-off-by: Tom Caputi <tcaputi@datto.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tim Chase <tim@chase2k.com> Ported-by: Tim Chase <tim@chase2k.com> Closes openzfs#6821 Closes openzfs#6864 NOTE: This is the first of the port of 3 related patches patches to the zfs-0.7-release branch of ZoL. The other two patches should immediately follow this one.
The current on-disk format for encrypted datasets protects
not only the encrypted and authenticated blocks, but also
the order and interpretation of these blocks. In order to
make this work while maintaining the ability to do raw sends
the indirect bps maintain a secure checksum of all the MACs
in the block below it, along with a few other fields that
determine how the data is interpreted.
Unfortunately, the current on-disk format erroneously
includes some fields which are not portable and thus cannot
support raw sends. It is also not possible to easily work
around this issue due to a separate and much smaller bug
which causes indirect blocks for encrypted dnodes to not
be compressed, which conflicts with the previous bug. In
addition, raw send streams do not currently include
dn_maxblkid which is needed in order to ensure that we are
correctly maintaining the portable objset MAC.
This patch zeros out the offending fields when computing the
bp MAC (as they should have been) and registers an errata for
the on-disk format bug. We detect the errata by adding a
"version" field to newly created DSL Crypto Keys. We allow
datasets without a version (version 0) to only be mounted for
read so that they can easily be migrated. We also now include
dn_maxblkid in raw send streams to ensure the MAC can be
maintained correctly.
Note that this fix has not yet been finalized and should not be used until it is tested, reviewed, and merged unless you are ok with losing your data.
Signed-off-by: Tom Caputi tcaputi@datto.com
How Has This Been Tested?
I have added a new test for raw sends that essentially stresses as many edge cases as I could think of. In addition, I have manually tested that the recovery process laid out in zfsonlinux/zfsonlinux.github.com#35 works as advertised, and that both old and new datasets function predictably.