api: GET /images/json: preserve original manifest order #48701

Merged
merged 1 commit into from
Oct 21, 2024


thaJeztah
Member

The manifests option, as used by the --tree option on docker image ls, currently sorts manifests to put those that are present first. The intent was to present "available" images at the top of each tree, followed by images that were not pulled.

However, there are some limitations to this approach. First of all, it makes the output non-deterministic: the order in which variants are pulled determines the order in which they're presented, i.e., the last pulled variant is returned first (I omitted some variants in the example for brevity).

Here's the result of pulling linux/riscv64, then pulling linux/arm64:

$ docker pull --platform=linux/riscv64 alpine:latest
$ docker image ls -a --tree

IMAGE                   ID             DISK USAGE   CONTENT SIZE   USED
alpine:latest           beefdbd8a1da       10.6MB         3.37MB
├─ linux/riscv64        80cde017a105       10.6MB         3.37MB
├─ linux/amd64          33735bd63cf8           0B             0B
└─ linux/arm64/v8       9cee2b382fe2           0B             0B


$ docker pull --platform=linux/arm64 alpine:latest
$ docker image ls -a --tree

IMAGE                   ID             DISK USAGE   CONTENT SIZE   USED
alpine:latest           beefdbd8a1da       24.2MB         7.46MB
├─ linux/riscv64        80cde017a105       10.6MB         3.37MB
├─ linux/arm64/v8       9cee2b382fe2       13.6MB         4.09MB
└─ linux/amd64          33735bd63cf8           0B             0B

Repeating the steps in reverse order reverses the output:

$ docker image rm alpine:latest
$ docker pull --platform=linux/arm64 alpine:latest
$ docker image ls -a --tree

IMAGE                   ID             DISK USAGE   CONTENT SIZE   USED
alpine:latest           beefdbd8a1da       13.6MB         4.09MB
├─ linux/arm64/v8       9cee2b382fe2       13.6MB         4.09MB
├─ linux/amd64          33735bd63cf8           0B             0B
└─ linux/riscv64        80cde017a105           0B             0B

$ docker pull --platform=linux/riscv64 alpine:latest
$ docker image ls -a --tree

IMAGE                   ID             DISK USAGE   CONTENT SIZE   USED
alpine:latest           beefdbd8a1da       24.2MB         7.46MB
├─ linux/riscv64        80cde017a105       10.6MB         3.37MB
├─ linux/arm64/v8       9cee2b382fe2       13.6MB         4.09MB
└─ linux/amd64          33735bd63cf8           0B             0B

The second limitation is that order sometimes matters; when matching a platform from a manifest-index, implementations may find multiple suitable candidates. In most cases the most suitable candidate can be selected (e.g., prefer linux/arm/v7 over linux/arm/v6), but manifest-indices do allow multiple entries for the same platform, in which case implementations match the first entry found.
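To illustrate why order matters when an index lists the same platform twice, here is a minimal Go sketch of first-match selection. The `Platform` and `Manifest` types and the `firstMatch` helper are hypothetical simplifications for illustration, not the daemon's or containerd's actual matcher, and the digests are placeholders.

```go
package main

import "fmt"

// Platform and Manifest are simplified, hypothetical stand-ins for the
// descriptors found in a manifest-index; they are not the daemon's types.
type Platform struct {
	OS, Architecture, Variant string
}

type Manifest struct {
	Digest   string
	Platform Platform
}

// firstMatch returns the first manifest whose platform equals want exactly.
// Real matchers also rank compatible variants (e.g. arm/v7 over arm/v6);
// this sketch only shows the tie-break when entries share a platform.
func firstMatch(index []Manifest, want Platform) (Manifest, bool) {
	for _, m := range index {
		if m.Platform == want {
			return m, true
		}
	}
	return Manifest{}, false
}

func main() {
	// Placeholder digests; the index deliberately has two arm64/v8 entries.
	index := []Manifest{
		{Digest: "sha256:aaa", Platform: Platform{OS: "linux", Architecture: "amd64"}},
		{Digest: "sha256:bbb", Platform: Platform{OS: "linux", Architecture: "arm64", Variant: "v8"}},
		{Digest: "sha256:ccc", Platform: Platform{OS: "linux", Architecture: "arm64", Variant: "v8"}},
	}
	m, ok := firstMatch(index, Platform{OS: "linux", Architecture: "arm64", Variant: "v8"})
	fmt.Println(m.Digest, ok)
}
```

If the two arm64/v8 entries were swapped in the index, the other digest would be selected, which is only observable when the API returns manifests in their original order.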

While these situations will be less common (and usually due to incorrect use of tooling such as docker manifest), being able to observe the order in which manifests appeared in the index can help with debugging or help the user understand why a specific variant was selected.

We should therefore not re-order these manifests, and return them in the order in which they appeared. If we decide to present "present" variants before "non-present" variants, we can do this ordering on the client side.

With this patch applied:

$ docker pull --quiet --platform=linux/riscv64 alpine:latest
$ docker pull --quiet --platform=linux/arm64 alpine:latest
$ docker image ls --tree alpine

IMAGE                   ID             DISK USAGE   CONTENT SIZE   USED
alpine:latest           beefdbd8a1da       24.2MB         7.46MB
├─ linux/amd64          33735bd63cf8           0B             0B
├─ linux/arm/v6         50f635c8b04d           0B             0B
├─ linux/arm/v7         f2f82d424957           0B             0B
├─ linux/arm64/v8       9cee2b382fe2       13.6MB         4.09MB
├─ linux/386            b3e87f642f5c           0B             0B
├─ linux/ppc64le        c7a6800e3dc5           0B             0B
├─ linux/riscv64        80cde017a105       10.6MB         3.37MB
└─ linux/s390x          2b5b26e09ca2           0B             0B

This matches the order of the manifests in the index:

$ docker buildx imagetools inspect --raw alpine:latest | jq -c .manifests[].platform
{"architecture":"amd64","os":"linux"}
{"architecture":"arm","os":"linux","variant":"v6"}
{"architecture":"arm","os":"linux","variant":"v7"}
{"architecture":"arm64","os":"linux","variant":"v8"}
{"architecture":"386","os":"linux"}
{"architecture":"ppc64le","os":"linux"}
{"architecture":"riscv64","os":"linux"}
{"architecture":"s390x","os":"linux"}

- Description for the changelog

api: `GET /images/json` with the `manifests` option enabled now preserves the original order in which manifests appeared in the manifest-index.


@thaJeztah
Member Author

Some test failures that look related and still need to be looked at:

=== Failed
=== FAIL: github.com/docker/docker/daemon/containerd TestImageList/one_image_with_two_platforms_is_still_one_entry (0.05s)
    image_list_test.go:250: assertion failed: amd64 (i.Manifests[0].ImageData.Platform.Architecture string) != arm64 (string)
    image_list_test.go:254: assertion failed: arm64 (i.Manifests[1].ImageData.Platform.Architecture string) != amd64 (string)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Contributor

@vvoland vvoland left a comment

LGTM. Indeed it makes more sense for the daemon API to preserve the original order.

@thaJeztah
Member Author

Thanks! Yes, we can still discuss what makes most sense on the CLI-side. This PR actually started when I made changes on the CLI-side to change the order (sorting the architectures alphabetically), then realised we did this ordering on the daemon-side 😂

@thaJeztah thaJeztah merged commit e038410 into moby:master Oct 21, 2024
164 checks passed
@thaJeztah thaJeztah deleted the keep_manifest_order branch October 21, 2024 08:14