Encrypted Indirect BPs erroneously MAC byteorder and compression bits #6845
Something about eggs and omelettes...? The likelihood of having to merc existing pools isn't great, but that's why it's not in 0.7.x.
Dropping encrypted filesystems is good enough. I just worry about people doing something like sending their data to an unencrypted dataset and then sending it back after patching, since technically this writes out all of their data in plaintext and ZFS doesn't have secure-delete functionality yet. We did delay the tagging of encryption until 0.8.0 for this reason specifically, but I still feel bad for everyone who has been helping to test it.
Don't feel bad, you rolled the bloody boulder up the hill for us, a few crushed toes are perfectly fine along the way. The more testing this gets the better, but data which needs crypto is often critical and thus not suited for testing long term. Kind of a chicken and egg issue. I for one am proper screwed, probably about 10t of backups which will need to be un-f'd. Luckily all tier3, so not primary restore sources in case we lose a host or pool. They're atop dm-crypt anyway until we know with a bit more certainty that this is safe, but it's going to be a fun week of send/recv.
Re secure discard, whatever happened to forcing TRIM anyway? Thought for sure that'd land in 0.7.0.
There are still some people working on it, but I'm not sure what happened to it upstream.
I have a production system running the encryption patches from Sep '16 and was planning to migrate to the current git master in the next few weeks (I have a mirror of all ZFS data on a LUKS ext4 volume). Good to see that there is a pending issue; I will wait until it is fixed. Would it be possible to add a GitHub issue tag for issues which will break ODS? I don't look at the ZFS issue tracker every day, and a tag would allow early adopters to check for ODS issues before switching to a new code revision. If 0.8.0 is expected in 3-5 months, we don't need a tag. But if it may take 1-2 years until 0.8.0 with crypto is released, it would really help.
@cytrinox we could add a new tag for PRs which change the on-disk format, but I'm not sure how helpful it would be. To be clear, the only time we change the on-disk format is when introducing a new feature flag, and we do our best to ensure those changes have been finalized before the PR is merged. To date this is the first time we've changed the format after merging a PR which adds a feature flag. And it was only an option because the feature has not yet been included in any tagged release.
@behlendorf then my request for a tag is nonsense.
The current on-disk format for encrypted datasets protects not only the encrypted and authenticated blocks, but also the order and interpretation of these blocks. In order to make this work while maintaining the ability to do raw sends, the indirect bps maintain a secure checksum of all the MACs in the block below it, along with a few other fields that determine how the data is interpreted. Unfortunately, the current on-disk format erroneously includes the byteorder and compression of the blocks below, which is not portable and thus cannot support raw sends. It is also not possible to easily work around this issue due to a separate and much smaller bug which causes indirect blocks for dnodes to not be compressed. This patch zeroes out the byteorder and compression when computing the MAC (as they should have been) and registers an errata for the on-disk format bug. Signed-off-by: Tom Caputi <tcaputi@datto.com>
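To make the idea in the commit message above concrete, here is a minimal sketch of a MAC over child block pointers that ignores the byteorder and compression bits. It is not the ZFS implementation: the bit positions, the function names `portable_prop` and `indirect_mac`, and the use of HMAC-SHA256 are all assumptions chosen for illustration.

```python
import hashlib
import hmac
import struct

# Assumed layout of the 64-bit property word of a block pointer.
# These positions are illustrative, not taken from the thread.
BYTEORDER_MASK = 1 << 63      # assumed: a single byteorder bit
COMPRESS_MASK = 0x7F << 32    # assumed: a seven-bit compression field
U64 = (1 << 64) - 1


def portable_prop(blk_prop: int) -> int:
    """Clear the non-portable bits so they cannot influence the MAC."""
    return blk_prop & ~(BYTEORDER_MASK | COMPRESS_MASK) & U64


def indirect_mac(key: bytes, children) -> bytes:
    """MAC over the (blk_prop, child_mac) pairs of the blocks below.

    HMAC-SHA256 stands in for whatever keyed checksum the real code uses.
    """
    h = hmac.new(key, digestmod=hashlib.sha256)
    for blk_prop, child_mac in children:
        h.update(struct.pack("<Q", portable_prop(blk_prop)))
        h.update(child_mac)
    return h.digest()
```

With the masking in place, two copies of the same data that differ only in how a receiving system compressed or byte-ordered the child blocks produce the same MAC, which is the property a raw send relies on.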
@tcaputi There is nothing to be worried about in the fact that some users (including me) now have some work to do. In the end we all know (or should know) that this can happen in master. What is currently a bit unclear to me is the fix. I see the fix for #6845 has been merged into master. So obviously the first step is to upgrade to master.
It has not. I have not made the PR yet. That branch is where I am staging my work for this but it is not done yet. Most notably, it still needs some decisions about how to handle reporting the errata to the user and a new test to ensure this doesn't get broken once it's fixed.
Unfortunately it's a little more complicated than this. You can only read the data on the old software version and the problem won't be corrected until the new version. So you will actually have to move it somewhere else, delete the encrypted datasets, upgrade your software and copy it back. This isn't ideal because for complete security you wouldn't want to put your encrypted data in plaintext on the pool (since ZFS doesn't currently support secure deletion), so you need separate storage for this elsewhere. I don't really have a better answer for this at the moment.
I think you meant unencrypted, but yes. As a part of this PR, I am making a related PR to the ZoL website to include information about this errata and what the recommended actions are.
@tcaputi Thanks for the clarification. Yes, I meant unencrypted and not uncompressed. Since it has not yet been merged into master: could you please give us a heads-up here when we can start migrating (however painful it will be)?
I will. @behlendorf and I might have a way to make this a lot less painful (you'd just have to do |
The current on-disk format for encrypted datasets protects not only the encrypted and authenticated blocks, but also the order and interpretation of these blocks. In order to make this work while maintaining the ability to do raw sends, the indirect bps maintain a secure checksum of all the MACs in the block below it, along with a few other fields that determine how the data is interpreted. Unfortunately, the current on-disk format erroneously includes some fields which are not portable and thus cannot support raw sends. It is also not possible to easily work around this issue due to a separate and much smaller bug which causes indirect blocks for encrypted dnodes to not be compressed, which conflicts with the previous bug. In addition, raw send streams do not currently include dn_maxblkid, which is needed in order to ensure that we are correctly maintaining the portable objset MAC. This patch zeroes out the offending fields when computing the bp MAC (as they should have been) and registers an errata for the on-disk format bug. We detect the errata by adding a "version" field to newly created DSL Crypto Keys. We allow datasets without a version (version 0) to only be mounted for read so that they can easily be migrated. We also now include dn_maxblkid in raw send streams to ensure the MAC can be maintained correctly. Signed-off-by: Tom Caputi <tcaputi@datto.com>
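The version-based detection described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary standing in for a DSL Crypto Key, the constant `CURRENT_CRYPTO_KEY_VERSION` (the thread does not state the new version number), and both function names are hypothetical.

```python
# Hypothetical value; the thread only says that a missing field means version 0.
CURRENT_CRYPTO_KEY_VERSION = 1


def crypto_key_version(key_entries: dict) -> int:
    """Keys written before the fix carry no "version" entry: treat as 0."""
    return int(key_entries.get("version", 0))


def allowed_mount_mode(key_entries: dict, requested: str = "rw") -> str:
    """Force read-only access for keys in the old (errata) format."""
    if crypto_key_version(key_entries) < CURRENT_CRYPTO_KEY_VERSION:
        return "ro"
    return requested
```

Treating an absent field as version 0 means old datasets need no rewrite to be recognized, and restricting them to read-only still leaves a path for copying the data into a newly created dataset.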
I just pushed a PR (#6864) for this issue. Please note that it is not yet complete and it should not be used (apart from testing purposes) until it is merged.
@tcaputi said:
This sounds interesting. Would this mean you could keep the same pool, keep the unencrypted volumes, and just transfer the encrypted volumes to new volumes within the same pool, on the same system, using a single ZoL version? Effectively making said new ZoL version able to read both old broken and new fixed type encrypted storage? I would love to see this trick make it into the eventual future tagged release that officially supports encryption. I'm ashamed to admit that during testing, encryption worked so beautifully, I let my dataset grow beyond what I initially bargained for. I'm considering keeping this around until the final tagged version makes it to mainline.
@Redsandro |
@tcaputi I will leave it alone until it is tagged together with the final version of native encryption.
I also notice that pool version is empty:
If this is neither a bug nor a redundant parameter, it could be improved with a message explaining why the version is empty and how else to determine compatibility with a given ZFS version on a given server.
The version that you are looking at is the ZPL version, which is different from the encryption version that we are adding here. The ZPL version determines how objects in a ZFS dataset relate to each other to present a filesystem. The encryption version refers to how these objects are protected.
The pool version is essentially deprecated (although we can never really get rid of it). Since OpenZFS became an open source project, the latest version of a ZFS pool is 5000 (which shows up as blank, as you have seen). Instead, we now use feature flags, which are a bit more conducive to having many developers and companies work on the project at once. This is documented in the man pages. For the moment we are not planning on exposing the encryption version. Since ZFS native encryption is still not in a tagged release, we are handling the old format by calling it an on-disk errata. Unlike older ZPL versions which were functional, the version 0 encryption implementation cannot work the way it was intended in all circumstances, so we don't want to support it (beyond allowing users to fix the problem) going forward.
@tcaputi thank you for the elaborate response. Just curious if the "allowing users to fix the problem" PR will be only in a non-tagged release for fixing purposes (to which we should pay close attention), or if it is planned to be available with the finalized encryption in a tagged version.
It will be in the tagged release and maintained going forward.
The on-disk format for encrypted datasets protects not only the encrypted and authenticated blocks themselves, but also the order and interpretation of these blocks. In order to make this work while maintaining the ability to do raw sends, the indirect bps maintain a secure checksum of all the MACs in the block below it along with a few other fields that determine how the data is interpreted. Unfortunately, the current on-disk format erroneously includes some fields which are not portable and thus cannot support raw sends. It is not possible to easily work around this issue due to a separate and much smaller bug which causes indirect blocks for encrypted dnodes to not be compressed, which conflicts with the previous bug. In addition, the current code generates incompatible on-disk formats on big endian and little endian systems due to an issue with how block pointers are authenticated. Finally, raw send streams do not currently include dn_maxblkid when sending both the metadnode and normal dnodes, which is needed in order to ensure that we are correctly maintaining the portable objset MAC. This patch zeroes out the offending fields when computing the bp MAC and ensures that these MACs are always calculated in little endian order (regardless of the host system's byte order). This patch also registers an errata for the old on-disk format, which we detect by adding a "version" field to newly created DSL Crypto Keys. We allow datasets without a version (version 0) to only be mounted for read so that they can easily be migrated. We also now include dn_maxblkid in raw send streams to ensure the MAC can be maintained correctly. Fixes openzfs#6845 Signed-off-by: Tom Caputi <tcaputi@datto.com>
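The endianness problem mentioned above can be illustrated with a short sketch. It compares a MAC computed over a fixed little-endian serialization with one computed over whatever byte order the host happens to use. The function names and the choice of HMAC-SHA256 are assumptions for illustration, not the actual ZFS code.

```python
import hashlib
import hmac
import struct


def mac_fixed_le(key: bytes, words) -> bytes:
    """MAC over 64-bit words, always serialized as little endian."""
    h = hmac.new(key, digestmod=hashlib.sha256)
    for w in words:
        h.update(struct.pack("<Q", w))
    return h.digest()


def mac_host_order(key: bytes, words, byteorder: str) -> bytes:
    """MAC over the in-memory bytes of a host with the given byte order.

    `byteorder` is "little" or "big" and simulates the host, so both
    cases can be compared on a single machine.
    """
    h = hmac.new(key, digestmod=hashlib.sha256)
    for w in words:
        h.update(w.to_bytes(8, byteorder))
    return h.digest()
```

Feeding host-order bytes into the MAC yields different results on big endian and little endian systems for the same logical values, so data written on one could not be verified on the other. Fixing the serialization order removes that dependency.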
The on-disk format for encrypted datasets protects not only the encrypted and authenticated blocks themselves, but also the order and interpretation of these blocks. In order to make this work while maintaining the ability to do raw sends, the indirect bps maintain a secure checksum of all the MACs in the block below it along with a few other fields that determine how the data is interpreted.
Unfortunately, the current on-disk format erroneously includes some fields which are not portable and thus cannot support raw sends. It is not possible to easily work around this issue due to a separate and much smaller bug which causes indirect blocks for encrypted dnodes to not be compressed, which conflicts with the previous bug.
In addition, the current code generates incompatible on-disk formats on big endian and little endian systems due to an issue with how block pointers are authenticated. Finally, raw send streams do not currently include dn_maxblkid when sending both the metadnode and normal dnodes, which is needed in order to ensure that we are correctly maintaining the portable objset MAC.
This patch zeroes out the offending fields when computing the bp MAC and ensures that these MACs are always calculated in little endian order (regardless of the host system's byte order). This patch also registers an errata for the old on-disk format, which we detect by adding a "version" field to newly created DSL Crypto Keys. We allow datasets without a version (version 0) to only be mounted for read so that they can easily be migrated. We also now include dn_maxblkid in raw send streams to ensure the MAC can be maintained correctly.
This patch also contains minor bug fixes and cleanups.
Fixes openzfs#6845
Fixes openzfs#7052
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Moved to archzfs/archzfs#222 |
The #6864 stability patch did fix a lot of issues with encryption. For me things work well with the integrated patch. I haven't tested raw sending so far. Also I'd recommend that you have at least two live USB drives available: one that loads the "old" zfs without the stability patch and one that loads the new zfs with the stability patch. The reason is that when you load the system with the new zfs, the old encrypted datasets become read-only. With the old live USB, the old encrypted datasets can still be accessed as normal, but you can't open the new encrypted ones. I had to use this because one dataset was too big. So I rsynced part of the ro-mounted old one.... booted into the old zfs usb.... removed the rsynced part... and booted back into the new zfs and rsynced the rest. |
Hi @sjau
This is actually really clever. I was planning to use Timeshift to sort of go back and forth in order to do some testing and comparing file hashes, but I haven't ever gone back so I'm unsure how reliable it is. Can you recommend a way to get a stick with ZFS as simply as possible? Did you write an automation for personal use that you can share?
|
give me your keyboard layout and I can generate isos for you... I use NixOS and it has some really great features (reproducible builds, atomic upgrades yadda yadda yadda) but it's a bitch trying to package new software for it.... Those will be "installer" isos but basically it boots up and drops you to a root shell so that you could then actually set up nixos etc... but you can also do partitioning and stuff and play with zfs just nicely. As for the pool: In nixos it was recommended to make a container for the encrypted datasets like tank/encryption/nixos. Since the top-level dataset "tank" isn't encrypted, you could just create a new encryption dataset, load the zfs key for the old one as well and just do zfs send/recv.... this only works if you have enough storage in your pool left for that dataset :) (I had a 300GB dataset and only 150GB left on my notebook) |
@Redsandro The patch covered all of the issues that I am currently aware of and have been able to reproduce. Hopefully, this should be the last of the on-disk changes for this feature (although we now have a mechanism to deal with them if they arise in the future). There is currently one other encryption-related patch #7115 that I expect should be merged in the next few days. It fixes a small issue that we have only ever hit in ztest, which basically races as much code as possible against each other. My next biggest priority (with regards to encryption) is to implement support for
However, if you want to receive a new filesystem and encrypt it this won't work because |
@tcaputi said:
My first thought was file descriptors and/or process substitution, but after hacking in bash for approximately the time between your comment and this one, I haven't been able to pipe data and import a password <(echo "like so") and keep them separate. I'm looking forward to hearing what you've come up with.
@tcaputi said:
Speaking of this, (how) can I use a key pipe with the mount command? I'm trying to figure out the best way to mount an encrypted dataset on a server from a laptop where the key is only on the laptop. But I can't seem to get it to work. The following non-working command illustrates what I'm trying to accomplish.
|
I somehow fail to see the issue here. If you can zfs send / recv then you already have access to both sides... why not just load-key first and then just issue zfs send / recv command? |
Yeah, I was able to do some things, but nothing with a clean user interface.
I don't know what's up. This worked for me. (You should not use echo in production):
echo 'password' | ssh ***@***.*** 'sudo zfs mount -l pool/encrypted'
The problem is that receiving for the first time requires both the passphrase and the stream in the same command. I can't load the key first because the dataset for it doesn't exist yet. |
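The general mechanism being discussed here (one command that needs a bulk stream and a secret at the same time, kept apart) can be sketched outside of ZFS. The following is only an illustration, assuming a POSIX system: the child program, the function name `run_with_separate_key`, and the idea of passing the descriptor number as an argument are all hypothetical, and nothing in this thread establishes that `zfs receive` can read a key this way.

```python
import os
import subprocess
import sys

# Hypothetical child: stands in for a receiver that wants the bulk stream on
# stdin and the secret on a separate descriptor whose number is in argv[1].
CHILD = r"""
import os, sys
key_fd = int(sys.argv[1])
with os.fdopen(key_fd, "rb") as key_file:
    key = key_file.read()
stream = sys.stdin.buffer.read()
sys.stdout.write(f"key={len(key)} stream={len(stream)}")
"""


def run_with_separate_key(stream: bytes, key: bytes) -> str:
    """Feed `stream` on stdin and `key` on its own pipe to one child process."""
    read_fd, write_fd = os.pipe()
    try:
        proc = subprocess.Popen(
            [sys.executable, "-c", CHILD, str(read_fd)],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            pass_fds=(read_fd,),  # keep only this extra descriptor open in the child
        )
    except Exception:
        os.close(write_fd)
        raise
    finally:
        os.close(read_fd)  # the parent never reads from the key pipe

    os.write(write_fd, key)  # a short secret fits in the pipe buffer
    os.close(write_fd)       # EOF lets the child's key read() return
    out, _ = proc.communicate(stream)  # now send the bulk data on stdin
    return out.decode()


if __name__ == "__main__":
    print(run_with_separate_key(b"hello world", b"secret"))
```

The point of the second pipe is that the secret never shares a channel with the data stream, which is the separation that process substitution in bash was being used to approximate.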
I seem to recall ssh forces password read from terminal.
On Feb 3, 2018, at 12:29 PM, Tom Caputi ***@***.***> wrote:
My first thought was file descriptors and/or process substitution, but after hacking in bash for approximately the time between your comment and this one, I haven't been able to pipe data and import a password <(echo "like so") and keep them separate. I'm looking forward to hear what you've come up with.
Yeah, I was able to do some things, but nothing with a clean user interface.
Speaking of this, (how) can I use a key pipe with the mount command? I'm trying to figure out the best way to mount an encrypted dataset on a server from a laptop where the key is only on the laptop. But I can't seem to get it to work. The following non-working command illustrates what I'm trying to accomplish.
I don't know what's up. This worked for me. (You should not use echo in production):
echo 'password' | ssh ***@***.*** 'sudo zfs mount -l pool/encrypted'
I somehow fail to see the issue here. If you can zfs send / recv then you already have access to both sides... why not just load-key first and then just issue zfs send / recv command?
The problem is that receiving for the first time requires both the passphrase and the stream in the same command. I can't load the key first because the dataset for it doesn't exist yet.
|
The on-disk format for encrypted datasets protects not only the encrypted and authenticated blocks themselves, but also the order and interpretation of these blocks. In order to make this work while maintaining the ability to do raw sends, the indirect bps maintain a secure checksum of all the MACs in the block below it along with a few other fields that determine how the data is interpreted.
Unfortunately, the current on-disk format erroneously includes some fields which are not portable and thus cannot support raw sends. It is not possible to easily work around this issue due to a separate and much smaller bug which causes indirect blocks for encrypted dnodes to not be compressed, which conflicts with the previous bug.
In addition, the current code generates incompatible on-disk formats on big endian and little endian systems due to an issue with how block pointers are authenticated. Finally, raw send streams do not currently include dn_maxblkid when sending both the metadnode and normal dnodes, which is needed in order to ensure that we are correctly maintaining the portable objset MAC.
This patch zeroes out the offending fields when computing the bp MAC and ensures that these MACs are always calculated in little endian order (regardless of the host system's byte order). This patch also registers an errata for the old on-disk format, which we detect by adding a "version" field to newly created DSL Crypto Keys. We allow datasets without a version (version 0) to only be mounted for read so that they can easily be migrated. We also now include dn_maxblkid in raw send streams to ensure the MAC can be maintained correctly.
This patch also contains minor bug fixes and cleanups.
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes openzfs#6845
Closes openzfs#6864
Closes openzfs#7052
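The version gate described in the commit message can be pictured roughly as follows. This is a minimal sketch, not the actual ZFS code: the dictionary standing in for a DSL Crypto Key, the `MountMode` enum, and the function name are hypothetical, and the only behavior taken from the message is that a key with no "version" field counts as version 0 and may only be mounted for read.

```python
from enum import Enum


class MountMode(Enum):
    READ_ONLY = "ro"
    READ_WRITE = "rw"


def allowed_mount_mode(crypto_key_attrs: dict) -> MountMode:
    """Decide how a dataset may be mounted from its (hypothetical) key attributes."""
    # Keys written before the fix carry no "version" entry; treat that as 0.
    version = crypto_key_attrs.get("version", 0)
    if version == 0:
        # Old on-disk format: readable so the data can be migrated, never writable.
        return MountMode.READ_ONLY
    return MountMode.READ_WRITE


if __name__ == "__main__":
    print(allowed_mount_mode({}))              # key created by the old code
    print(allowed_mount_mode({"version": 1}))  # key created after the fix
```

Detecting the old format by the absence of a field means datasets created before the fix need no rewrite to be recognized, which matches the read-only migration path reported earlier in this thread.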
Edit: Never mind, I'll destroy the pool and try something else. Original message below. @tcaputi @behlendorf I have this dataset from back in 2017 giving problems during transfer for the purpose of updating, and then I remembered we discussed this issue above. The faulty received sets cannot be removed. Can this be fixed, or do I need to destroy the whole pool, including the datasets that are fine? PS - The original data is fine and is also safely backed up. Just wondering if this is doable locally because it would save me a lot of time. For more details, see #11661 |
While looking into #6806 I discovered 2 small errors with the on-disk format for encrypted datasets that present problems with regard to raw sends. Indirect BPs include a checksum-of-MACs of a few fields in all of the BPs below. The way this is supposed to work is that the checksum-of-MACs only protects fields which can be preserved when doing a raw zfs send -w. However, the bug is that compression and byte order are included in these MACs, which is not portable to other systems.
On its own, this wouldn't be a big problem. We could simply adjust the on-disk format so that it overrides the real values with LZ4 compression and little endian byte order in all cases, since these 2 values are by far the most commonly used in production. This would mean virtually nobody would notice the on-disk format "change". Unfortunately, there is another much less serious bug where indirect dnode blocks are not getting compressed. The way that these 2 bugs interact would require us to always disable compression for encrypted indirect dnode blocks, which could have a significant performance impact.
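To make the portability problem concrete, here is a rough sketch of a checksum-of-MACs that either includes or masks out the compression and byte order fields. It is an illustration under stated assumptions, not the ZFS implementation: the `ChildBP` layout, the field values, and the use of SHA-256 as the "secure checksum" are all hypothetical stand-ins.

```python
import hashlib
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class ChildBP:
    """Hypothetical, heavily simplified stand-in for a child block pointer."""
    mac: bytes         # MAC of the child block
    level: int         # assumed to survive a raw send unchanged
    obj_type: int      # assumed to survive a raw send unchanged
    compression: int   # identified in this issue as not portable
    big_endian: bool   # identified in this issue as not portable


def checksum_of_macs(children, include_nonportable=False):
    """Digest over each child's MAC plus the fields that govern interpretation."""
    h = hashlib.sha256()
    for bp in children:
        # Masking: the non-portable fields contribute a constant zero.
        comp = bp.compression if include_nonportable else 0
        order = int(bp.big_endian) if include_nonportable else 0
        # "<" fixes the serialization to little endian on every host.
        h.update(struct.pack("<QQQQ", bp.level, bp.obj_type, comp, order))
        h.update(bp.mac)
    return h.digest()


# Same child MAC, but the two sides differ in compression and byte order
# (the numeric values here are arbitrary).
sender = [ChildBP(b"\x01" * 16, 0, 19, compression=15, big_endian=False)]
receiver = [ChildBP(b"\x01" * 16, 0, 19, compression=2, big_endian=True)]

if __name__ == "__main__":
    print(checksum_of_macs(sender, True) == checksum_of_macs(receiver, True))
    print(checksum_of_macs(sender) == checksum_of_macs(receiver))
```

With the non-portable fields included, the two sides compute different digests even though the authenticated child data is identical; with them masked, the digests agree.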
I am currently working on a patch to correct this issue, although it will almost definitely require breaking existing pools that are using encryption. I am creating this ticket to help people watch the progress on this issue and to try to address any concerns they may have.