Skip to content

Commit

Permalink
image-rs: fix duplicated image layer detective logic
Browse files Browse the repository at this point in the history
Before this commit, when we pull images who have two encrypted layers
whose corresponding plaintext layers are same like
`quay.io/fidencio/prueba:encrypted`, ther will be an error like

```
index out of bounds: the len is 25 but the index is 25
```

This is caused by the deduplication logic inside image pull logic. On
one hand, it will delete duplicated layers recorded inside image
manifest, who reflectes the encrypted layers/blobs. On the other hand,
it will delete duplicated layers recorded inside the config.json, who
reflects the plaintext of the layers.

The image encryption logic will generate a random symmetric key for each
layer. Thus even the same plaintext layer would be encrypted into
different blobs/layers. Thus after deduplication, we might have more
layers for image manifest.

This patch changes the deduplicating logic, by only check the layer
digests inside image manifest, s.t. even if there are two same plaintext
layers, we will pull and decrypt both of them. It's ok to do some
optimization later if a fully analyzation is token.

Fies confidential-containers#840

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
  • Loading branch information
Xynnn007 committed Dec 11, 2024
1 parent 9f518e9 commit 3f8c601
Show file tree
Hide file tree
Showing 2 changed files with 23 additions and 16 deletions.
33 changes: 17 additions & 16 deletions image-rs/src/image.rs
Original file line number Diff line number Diff line change
Expand Up @@ -422,7 +422,7 @@ fn create_image_meta(
id: id.to_string(),
digest: image_digest.to_string(),
reference: image_url.to_string(),
image_config: ImageConfiguration::from_reader(image_config.to_string().as_bytes())?,
image_config: ImageConfiguration::from_reader(image_config.as_bytes())?,
..Default::default()
};

Expand All @@ -431,26 +431,27 @@ fn create_image_meta(
bail!("Pulled number of layers mismatch with image config diff_ids");
}

// Note that an image's `diff_ids` may always refer to plaintext layer
// digests. For two encryption layers encrypted from a same plaintext
// layer, the `LayersData.Digest` of the image manifest might be different
// because the symmetric key to encrypt is different, thus the cipher text
// is different. Interestingly in such case the `diff_ids` of the both
// layers are the same in the config.json.
// Another note is that the order of layers in the image config and the
// image manifest will always be the same, so it is safe to use a same
// index to lookup or mark a layer.
let mut unique_layers = Vec::new();
let mut digests = BTreeSet::new();
for l in &image_manifest.layers {
if digests.contains(&l.digest) {
continue;
}

digests.insert(&l.digest);
unique_layers.push(l.clone());
}

let mut unique_diff_ids = Vec::new();
let mut id_tree = BTreeSet::new();
for id in diff_ids {
if id_tree.contains(id.as_str()) {

let mut digests = BTreeSet::new();
for i in 0..diff_ids.len() {

Check failure on line 447 in image-rs/src/image.rs

View workflow job for this annotation

GitHub Actions / Check (stable, ubuntu-24.04)

the loop variable `i` is used to index `diff_ids`

Check failure on line 447 in image-rs/src/image.rs

View workflow job for this annotation

GitHub Actions / Check (1.76.0, ubuntu-24.04)

the loop variable `i` is used to index `diff_ids`

Check failure on line 447 in image-rs/src/image.rs

View workflow job for this annotation

GitHub Actions / Check (ubuntu-24.04, stable)

the loop variable `i` is used to index `diff_ids`
if digests.contains(&image_manifest.layers[i].digest) {
continue;
}

id_tree.insert(id.as_str());
unique_diff_ids.push(id.clone());
digests.insert(&image_manifest.layers[i].digest);
unique_layers.push(image_manifest.layers[i].clone());
unique_diff_ids.push(diff_ids[i].clone());
}

Ok((image_data, unique_layers, unique_diff_ids))
Expand Down
6 changes: 6 additions & 0 deletions image-rs/src/pull.rs
Original file line number Diff line number Diff line change
Expand Up @@ -162,6 +162,10 @@ impl<'a> PullClient<'a> {
};

let decryptor = Decryptor::from_media_type(&layer.media_type);

// There are two types of layers:
// 1. Comporessed layer = Compress(Layer Data)
// 2. Encrypted+Compressed layer = Compress(Encrypt(Layer Data))
if decryptor.is_encrypted() {
let decrypt_key = tokio::task::spawn_blocking({
let decryptor = decryptor.clone();
Expand Down Expand Up @@ -210,6 +214,8 @@ impl<'a> PullClient<'a> {
Ok(layer_meta)
}

/// Decompress and unpack layer data. The returned value is the
/// digest of the uncompressed layer.
async fn async_decompress_unpack_layer(
&self,
input_reader: (impl tokio::io::AsyncRead + Unpin + Send),
Expand Down

0 comments on commit 3f8c601

Please sign in to comment.