Manifest list encryption #7770

Open
wants to merge 13 commits into
base: main
from

Conversation

ggershinsky
Contributor

No description provided.

@ggershinsky ggershinsky marked this pull request as draft June 5, 2023 06:10
@ggershinsky ggershinsky force-pushed the manifest-list-encryption branch from aba7650 to 3162f9e Compare March 26, 2024 05:37
@@ -162,6 +162,15 @@ default Iterable<DeleteFile> removedDeleteFiles(FileIO io) {
*/
String manifestListLocation();

/**
* Return the size of this snapshot's manifest list. For encrypted tables, a verified plaintext
Contributor Author

fix comment

@ggershinsky ggershinsky force-pushed the manifest-list-encryption branch from 3162f9e to a2d7b10 Compare March 27, 2024 06:38
@ggershinsky ggershinsky marked this pull request as ready for review March 27, 2024 07:07
@@ -162,6 +162,25 @@ default Iterable<DeleteFile> removedDeleteFiles(FileIO io) {
*/
String manifestListLocation();

/**
* Return the size of this snapshot's manifest list file. Must be a verified value, taken from a
Member

I'm confused here: the base Snapshot also sets a default of -1, which seems to be allowed as an "unset" value. Should we mention that here, or is the size always required?

Contributor Author

We can define this field to be required only for encrypted tables. It will not be set in the snapshot file for unencrypted tables, where this method can return 0 (or -1; I'll make it consistent across all implementation classes).
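
The "required only for encrypted tables" contract could be sketched roughly as follows. This is only an illustration under assumptions: `SnapshotLike`, `UNSET_SIZE`, and `verifiedSize` are hypothetical names, not Iceberg API, and -1 is assumed as the "unset" value even though the thread had not yet settled between 0 and -1.

```java
import java.util.OptionalLong;

public class ManifestListSizeSketch {
  // Assumed sentinel for "size not recorded"; the thread leaves 0 vs -1 open.
  public static final long UNSET_SIZE = -1L;

  // Hypothetical stand-in for the Snapshot interface, reduced to two accessors.
  public interface SnapshotLike {
    String manifestListKeyMetadata(); // null for unencrypted tables

    default long manifestListSize() {
      return UNSET_SIZE;
    }
  }

  // Returns the recorded size for encrypted tables and empty for unencrypted ones.
  public static OptionalLong verifiedSize(SnapshotLike snapshot) {
    if (snapshot.manifestListKeyMetadata() == null) {
      return OptionalLong.empty();
    }
    long size = snapshot.manifestListSize();
    if (size < 0) {
      throw new IllegalStateException(
          "Cannot read encrypted manifest list because its size is not set");
    }
    return OptionalLong.of(size);
  }
}
```

With a single defined "missing" value, readers can tell "not recorded" apart from a genuine size, which is the ambiguity raised in this thread.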

if (manifestListKeyMetadata != null) { // encrypted manifest list file
Preconditions.checkArgument(
fileIO instanceof EncryptingFileIO,
"No encryption in FileIO class " + fileIO.getClass());
Member

Cannot read manifest list (%s) because it is encrypted but the configured FileIO (%s) does not implement EncryptingFileIO

EncryptingFileIO encryptingFileIO = (EncryptingFileIO) fileIO;
Preconditions.checkArgument(
encryptingFileIO.encryptionManager() instanceof StandardEncryptionManager,
"Encryption manager for encrypted manifest list files can currently only be an instance of "
Member

@RussellSpitzer RussellSpitzer Mar 28, 2024

Cannot decrypt manifest list (%s) because the encryption manager (%s) does not implement StandardEncryptionManager
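
A small sketch of the suggested "Cannot X because Y" message style. To stay self-contained it uses a local `checkArgument` helper that imitates the template form of Guava's `Preconditions.checkArgument(boolean, String, Object...)`; the `Encrypting` marker interface and the `validate` method are hypothetical stand-ins, not the PR's code.

```java
public class CannotBecauseMessages {
  // Local stand-in for a template-style precondition check (assumption, not Iceberg code).
  public static void checkArgument(boolean condition, String template, Object... args) {
    if (!condition) {
      throw new IllegalArgumentException(String.format(template, args));
    }
  }

  // Hypothetical marker interface standing in for EncryptingFileIO.
  public interface Encrypting {}

  // Fails with a "Cannot X because Y" message that names both the file and the FileIO class.
  public static void validate(Object fileIO, String manifestListLocation) {
    checkArgument(
        fileIO instanceof Encrypting,
        "Cannot read manifest list (%s) because it is encrypted but the configured FileIO (%s) "
            + "does not implement EncryptingFileIO",
        manifestListLocation,
        fileIO.getClass().getName());
  }
}
```

Passing the location and class as template arguments, rather than concatenating them, keeps the message cheap to build when the check passes.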

generator.writeStringField(MANIFEST_LIST_KEY_METADATA, snapshot.manifestListKeyMetadata());
}

// TODO discuss: do we need to sign the size value? Or sign the whole snapshot?
Member

How would this attack work? Wouldn't the user also need the key to encrypt the replacement files? I thought we were storing the metadata.json key in the catalog so an attacker could replace everything but still not be able to trick a client using the catalog.

Contributor Author

This is the question. Some thoughts on the scenarios and protection options:

  • Currently, we don't have a metadata.json key; we have only a key for the snapshot's manifest list file. Besides using it to encrypt the manifest list file, we can also use this key to sign the snapshot's sensitive parts, such as the manifest list size field, or to sign the whole metadata.json file (should be possible with some effort). The latter would also protect the integrity of e.g. the table properties (like the table key id).
  • The snapshot (metadata.json file) doesn't keep secret values, so encrypting it might not be required. The signatures mentioned above would be kept in added snapshot fields, which is sufficient for detecting file modification attacks.
  • These protection techniques are not required with the REST catalog, because we trust the catalog service (we don't trust the storage service). Since the whole snapshot is stored in the REST catalog, we don't need to sign anything.
  • The manifest list key is not stored in the catalog. Instead, it is wrapped in a KMS with the table master key, and stored in the snapshot MANIFEST_LIST_KEY_METADATA field. Only users/processes that are KMS-authorized for the table key will be able to get the manifest list key.
  • In catalogs other than REST, the signatures provide only partial protection, because the metadata.json is kept in untrusted storage. With the signatures, it can't be modified undetected. But the whole folder can be replaced (e.g. a replay attack, where all table files are removed and replaced with files of an older version of the table). To prevent this attack in non-REST catalogs, we will have to update the catalog for each table snapshot (setting e.g. the latest table version/sequence number, or a random AAD prefix).
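
The "sign the size field" option could look roughly like the sketch below. It is only an illustration: the thread does not name a signing scheme, so HMAC-SHA256 is an assumption, as are the class and method names and the choice to bind the snapshot ID into the signed bytes.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class ManifestListSizeSigner {
  // Computes an HMAC-SHA256 tag over (snapshotId, manifestListSize) with the given key bytes.
  public static byte[] sign(byte[] key, long snapshotId, long manifestListSize) {
    try {
      Mac mac = Mac.getInstance("HmacSHA256");
      mac.init(new SecretKeySpec(key, "HmacSHA256"));
      ByteBuffer fields = ByteBuffer.allocate(2 * Long.BYTES);
      fields.putLong(snapshotId).putLong(manifestListSize);
      return mac.doFinal(fields.array());
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("Cannot sign manifest list size", e);
    }
  }

  // Recomputes the tag and compares it in constant time.
  public static boolean verify(byte[] key, long snapshotId, long size, byte[] tag) {
    return MessageDigest.isEqual(sign(key, snapshotId, size), tag);
  }
}
```

A tag like this detects a modified size value, but as the last bullet notes, it does not stop a whole-folder replay: an older snapshot carries its own valid tag.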

&& encryptedManifestList.keyMetadata().buffer() != null) {
Preconditions.checkArgument(
encryptionManager instanceof StandardEncryptionManager,
"Encryption manager for encrypted manifest list files can currently only be an instance of "
Member

Similar comment as above: use the "Cannot X because Y" form.

private static final long EXISTING_ROWS = 857273L;
private static final int DELETED_FILES = 1;
private static final long DELETED_ROWS = 22910L;
private static final List<ManifestFile.PartitionFieldSummary> PARTITION_SUMMARIES =
Member

We probably need a test example which has a non-empty list of partition field summaries


@Test
public void testV2Write() throws IOException {
ManifestFile manifest = writeAndReadManifestList();
Member

nit: writeAndReadEncryptedManifestList

public void testV2Write() throws IOException {
ManifestFile manifest = writeAndReadManifestList();

// all v2 fields should be read correctly
Member

AssertJ has a helper for this, though I'm not sure it is correct here:

assertThat(actual).usingRecursiveComparison().isEqualTo(expected);

Contributor Author


// TODO discuss: do we need to sign the size value? Or sign the whole snapshot?
// Or rely on REST catalog? - the only option that prevents "full folder replacement" attack.
if (snapshot.manifestListSize() >= 0) {
Member

Another small question: we are essentially doing a transform here, where manifest list sizes < 0 become 0.

Also nit: we are ignoring 0s that get passed through, although we will read the size as 0 if it is missing.

Just wondering what the intent is here. I think it may be better to have a defined missing value? Not sure.

Contributor Author

Yep, part of this thread #7770 (comment)

Member

@RussellSpitzer RussellSpitzer left a comment

Is it important that we store the manifest list size? Won't the encryption be enough to prove the file is the right one?

@ggershinsky
Contributor Author

Yep, this is due to https://github.com/apache/iceberg/blob/main/format/gcm-stream-spec.md#file-length . Table modification attacks are possible if this field is not (safely) stored.
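
To illustrate why per-block encryption alone does not prove the file is complete, here is a simplified sketch. It is not the GCM stream format from the linked spec (no header, no AAD); it only assumes a file made of independently sealed AES-GCM blocks, and the class and method names are hypothetical. Dropping the last block leaves every remaining block individually valid, so only a trusted length reveals the truncation.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.List;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class TruncationSketch {
  private static final SecureRandom RANDOM = new SecureRandom();
  private static final int NONCE_LENGTH = 12;

  // Seals one block as nonce (12 bytes) || ciphertext || tag (16 bytes).
  public static byte[] encryptBlock(byte[] key, byte[] plain) {
    try {
      byte[] nonce = new byte[NONCE_LENGTH];
      RANDOM.nextBytes(nonce);
      Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
      cipher.init(
          Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new GCMParameterSpec(128, nonce));
      byte[] sealed = cipher.doFinal(plain);
      byte[] block = new byte[NONCE_LENGTH + sealed.length];
      System.arraycopy(nonce, 0, block, 0, NONCE_LENGTH);
      System.arraycopy(sealed, 0, block, NONCE_LENGTH, sealed.length);
      return block;
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("Cannot encrypt block", e);
    }
  }

  // Opens one block; fails if the block itself was tampered with.
  public static byte[] decryptBlock(byte[] key, byte[] block) {
    try {
      Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
      cipher.init(
          Cipher.DECRYPT_MODE,
          new SecretKeySpec(key, "AES"),
          new GCMParameterSpec(128, block, 0, NONCE_LENGTH));
      return cipher.doFinal(block, NONCE_LENGTH, block.length - NONCE_LENGTH);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("Cannot decrypt block", e);
    }
  }

  // Checks the stored length against a trusted value before decrypting; returns plaintext bytes.
  public static int decryptAll(byte[] key, List<byte[]> blocks, long trustedLength) {
    long actualLength = 0;
    for (byte[] block : blocks) {
      actualLength += block.length;
    }
    if (actualLength != trustedLength) {
      throw new IllegalStateException(
          "Cannot read file because its length does not match the trusted length");
    }
    int plainBytes = 0;
    for (byte[] block : blocks) {
      plainBytes += decryptBlock(key, block).length;
    }
    return plainBytes;
  }
}
```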

this.v1ManifestLocations = v1ManifestLocations;
this.manifestListKeyMetadata = null;
Member

Same as comment above (group manifest vars)

* In encrypted tables, return the size of this snapshot's manifest list file. Must be a verified
* value, taken from a trusted source. In unencrypted tables, can return 0.
*/
default long manifestListSize() {
Contributor

Rather than adding new methods for each piece of new information we want to pass, what about adding a ManifestList object that contains the location, size, and key metadata?
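
One possible shape for such an object, as a rough sketch only. The class name `ManifestListInfo` and its accessors are hypothetical and are not the type the PR eventually introduces.

```java
import java.nio.ByteBuffer;

// Hypothetical value object grouping what a reader needs to open a manifest list.
public final class ManifestListInfo {
  private final String location;
  private final long size;
  private final ByteBuffer keyMetadata; // null when the manifest list is not encrypted

  public ManifestListInfo(String location, long size, ByteBuffer keyMetadata) {
    if (location == null) {
      throw new IllegalArgumentException("Cannot create manifest list info because location is null");
    }
    this.location = location;
    this.size = size;
    this.keyMetadata = keyMetadata;
  }

  public String location() {
    return location;
  }

  public long size() {
    return size;
  }

  // Returns a read-only view so callers cannot change the stored buffer's content or position.
  public ByteBuffer keyMetadata() {
    return keyMetadata == null ? null : keyMetadata.asReadOnlyBuffer();
  }

  public boolean isEncrypted() {
    return keyMetadata != null;
  }
}
```

Grouping the fields means a future addition touches one type rather than adding another method to the Snapshot interface.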

@ggershinsky ggershinsky force-pushed the manifest-list-encryption branch from 6c7f5a1 to 5486f6f Compare May 26, 2024 06:41
@ggershinsky ggershinsky reopened this May 28, 2024
@@ -162,6 +162,16 @@ default Iterable<DeleteFile> removedDeleteFiles(FileIO io) {
*/
String manifestListLocation();
Contributor

Is String manifestListLocation() the same as what is stored in ManifestListFile? If so, do we still need it as a separate field?

Contributor Author

@rdblue @RussellSpitzer what do you think?

Contributor Author

There are dozens of calls to this method today (tests etc), so we'll likely need to keep it for a while.

Contributor

Yeah, I saw that. We can consolidate in a separate PR.

Contributor

Yeah, I agree. We need to keep this because it is part of the public API and used in many cases.

ByteBuffer.wrap(
Base64.getDecoder().decode(snapshot.manifestListFile().wrappedKeyMetadata()));

NativeEncryptionKeyMetadata wrappedKeyMetadata =
Contributor

Is this unused?

Contributor Author

Good catch. I'll remove these lines.

Comment on lines 179 to 181
ByteBuffer manifestListKeyMetadata = null;
ByteBuffer wrappedManifestListKeyMetadata = null;
String wrappedKeyEncryptionKey = null;
Contributor

Nit: Declare these inside if (node.has(MANIFEST_LIST_KEY_METADATA)) ?

Contributor Author

They are used later, in the BaseSnapshot constructor. We can use a different constructor (if no manifest list encryption), but the code won't be more compact.

return size;
}

public String wrappedKeyEncryptionKey() {
Contributor Author

Add javadoc: "Manifest list keys are encrypted with a table key encryption key. This function returns a KMS wrap of the key encryption key."
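
The two-level wrapping described in that javadoc can be sketched as below. Everything here is an assumption made for illustration: the KMS call is simulated locally, the JCE `AESWrap` cipher stands in for whatever wrapping the KMS and the PR actually use, and the class and method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

public class KeyWrapSketch {
  // Wraps keyToWrap under wrappingKey using AES key wrap.
  public static byte[] wrap(byte[] wrappingKey, byte[] keyToWrap) {
    try {
      Cipher cipher = Cipher.getInstance("AESWrap");
      cipher.init(Cipher.WRAP_MODE, new SecretKeySpec(wrappingKey, "AES"));
      return cipher.wrap(new SecretKeySpec(keyToWrap, "AES"));
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("Cannot wrap key", e);
    }
  }

  // Recovers the original key bytes; fails if wrappingKey is wrong or the input was modified.
  public static byte[] unwrap(byte[] wrappingKey, byte[] wrappedKey) {
    try {
      Cipher cipher = Cipher.getInstance("AESWrap");
      cipher.init(Cipher.UNWRAP_MODE, new SecretKeySpec(wrappingKey, "AES"));
      return cipher.unwrap(wrappedKey, "AES", Cipher.SECRET_KEY).getEncoded();
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("Cannot unwrap key", e);
    }
  }

  // Reader path: the master key (held by the KMS) unwraps the KEK, which unwraps the list key.
  public static byte[] recoverManifestListKey(
      byte[] masterKey, byte[] wrappedKek, byte[] wrappedManifestListKey) {
    byte[] kek = unwrap(masterKey, wrappedKek);
    return unwrap(kek, wrappedManifestListKey);
  }
}
```

The point of the extra level is that one KMS round trip for the key encryption key can serve many manifest list keys.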

@rdblue
Contributor

rdblue commented Sep 16, 2024

@ggershinsky, I'm not too concerned with the size of the cache. I'm okay with 1 day, but that seems like a long time to have unencrypted key material in memory. I'll defer to your judgement here.

@ggershinsky
Contributor Author

Ok. We don't have clear guidelines on key caching in memory. Key copies are spread all over the process memory (the cache, plug-in KMS client code, an HTTP library in the KMS client code), and with the Java GC there are no guarantees for when an uncached key is deleted from memory, if ever, before the process stops. But I agree a day could be too long. I'll change it to 1 hour, which might be a reasonable trade-off between performance requirements (KMS call overhead) and safety requirements (there is a chance a key will be deleted from memory within a business day).
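
The trade-off can be made concrete with a minimal time-based cache sketch. This is not the PR's cache; `ExpiringKeyCache`, its injected clock, and the unwrapper callback are hypothetical names used only to show how a one-hour expiry bounds both the number of KMS calls and how long the cache itself holds a key.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

public class ExpiringKeyCache {
  private static final class Entry {
    final byte[] key;
    final long expiresAtMillis;

    Entry(byte[] key, long expiresAtMillis) {
      this.key = key;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
  private final long ttlMillis;
  private final LongSupplier clock; // injected so tests can control time

  public ExpiringKeyCache(long ttlMillis, LongSupplier clock) {
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  // Returns the cached key, or calls the unwrapper (e.g. a KMS client) if missing or expired.
  public byte[] get(String keyId, Function<String, byte[]> unwrapper) {
    long now = clock.getAsLong();
    Entry entry =
        entries.compute(
            keyId,
            (id, existing) -> {
              if (existing != null && existing.expiresAtMillis > now) {
                return existing;
              }
              return new Entry(unwrapper.apply(id), now + ttlMillis);
            });
    return entry.key;
  }
}
```

Note that expiry only drops the cache's reference; as the comment above says, other copies may linger in process memory until the GC or the process exit removes them.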

@ggershinsky
Contributor Author

Hi @rdblue , I've built the integration code with the latest version of this patch, works ok. Can we merge this PR?

@RussellSpitzer
Member

@rdblue Did you have any more comments on this one? I can do another pass too, but I'd like to finish this up soon.

@ggershinsky
Contributor Author

@rdblue @RussellSpitzer All current comments should have been addressed in this thread and in the last commit. An additional review round is always welcome, I too would like to complete this feature.

* @param key unwrapped snapshot key bytes
* @param snapshotId ID of the table snapshot
* @param keyMetadata unencrypted EncryptionKeyMetadata
* @return a Pair of the key ID used to encrypt and the encrypted key metadata
Contributor

@smaheshwar-pltr smaheshwar-pltr Nov 15, 2024

Suggested change
* @return a Pair of the key ID used to encrypt and the encrypted key metadata
* @return the encrypted key metadata

(nit)

@caushie-akamai

Hi! Are there any plans to have this feature merged in 1.8.0? This would be extremely helpful

@rshkv
Contributor

rshkv commented Dec 17, 2024

+1, we have some compliance requirements that prevent us from adopting Iceberg without client-side encryption.

@ggershinsky
Contributor Author

^^ cc @rdblue @RussellSpitzer

@RussellSpitzer RussellSpitzer added this to the Iceberg 1.8.0 milestone Jan 15, 2025
Contributor

@amogh-jahagirdar amogh-jahagirdar left a comment

I feel like this is pretty close; I had some comments on deprecation notes that need to be updated. The main point that may be lingering is the decryptKeyMetadata(EncryptionManager) interface, but I don't think there's a good alternative to that. cc @rdblue

Comment on lines 153 to 125
* @deprecated will be removed in 1.8.0; use {@link #unwrapKey(String)}} instead.
*/
Contributor

Note, if this PR doesn't have any more comments and we think we can get it in for 1.8 we should update this to be "will be removed in 1.9.0"

Contributor Author

Per the slack discussion, the target release is not yet finalized.

@@ -81,22 +136,75 @@ private SecureRandom workerRNG() {
return lazyRNG;
}

/**
* @deprecated will be removed in 1.8.0; use {@link #currentSnapshotKeyId()} instead.
Contributor

same as below

Comment on lines +264 to +291
if (writer != null) {
try {
writer.close(); // must close before getting file length
} catch (IOException e) {
throw new RuntimeIOException(e, "Failed to close manifest list file writer");
Contributor

Did this part need to change?

Contributor

Oh I see, we need to explicitly flush so that by the time "toManifestListFile" is called, we are guaranteed to have a length to work with.
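
The effect can be reproduced with plain JDK streams. The sketch below is not the manifest list writer; it assumes only a buffered stream over a temp file, and the class name is hypothetical. It shows that the on-disk length observed before `close()` can be smaller than the final length.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseBeforeLength {
  // Writes payload through a buffered stream; returns {length before close, length after close}.
  public static long[] lengthsAroundClose(byte[] payload) {
    try {
      Path file = Files.createTempFile("manifest-list", ".bin");
      try {
        long before;
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(file), 8192)) {
          out.write(payload);
          before = Files.size(file); // buffered bytes may not have reached the file yet
        }
        long after = Files.size(file); // close() flushed everything
        return new long[] {before, after};
      } finally {
        Files.deleteIfExists(file);
      }
    } catch (IOException e) {
      throw new UncheckedIOException("Cannot measure file length", e);
    }
  }
}
```

For a payload smaller than the buffer, the first value is 0 and the second is the payload length, which is why a size recorded before closing would be wrong.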

}

@Override
public ByteBuffer decryptKeyMetadata(EncryptionManager em) {
Contributor

Yeah I'm not sure if there's an alternative to passing the EncryptionManager at least with the current design of EncryptingFileIO

@amogh-jahagirdar
Contributor

I'd also suggest to make sure we rebase because it's been a while since this was open, and we should see CI pass with the rebased changes before merging.

@ggershinsky
Contributor Author

> I'd also suggest to make sure we rebase because it's been a while since this was open, and we should see CI pass with the rebased changes before merging.

Sure, I'll rebase this after Iceberg 1.8 is out (per the slack discussion)

@rdblue rdblue removed this from the Iceberg 1.8.0 milestone Jan 22, 2025
@gumartinm

If you need help with this pull request @ggershinsky, perhaps I could help you.

ggershinsky and others added 8 commits February 24, 2025 08:12
update snapshot producer

update the patch

clean up

fix for previous patch

address review comments

move key wrapping to metadata encryption

encrypt manifest list key metadata

new aad util

null key needs no encryption

comment; clearer method/var names

use key encryption key for manifest list keys

add encryption util changes

update EncryptionTestHelpers

handle api change

remove unused lines

revert revapi.yml

KEK cache

unitest update

rename var

address review comments

fix timeout default

change writer kek timeout default

Updates from review.

cache unwrapped keys
Co-authored-by: Ryan Blue <rdblue@gmail.com>
@ggershinsky ggershinsky force-pushed the manifest-list-encryption branch from dcb5e4d to 39363fa Compare February 24, 2025 06:16
@ggershinsky
Contributor Author

This PR is rebased and synced with the spec patch.


9 participants