Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kbs: Remove oci.kbs version and references #2077

Conversation

stevenhorsman
Copy link
Member

Our KBS logic currently checks out the git.kbs version of code, but then edits the deployment config to override the image based on the oci.kbs version.

This means that there is an assumption that the oci image of the KBS is compatible with the kustomization, which might not not always be the case and in the "always safe" case where the image is built from that exact version of code (e.g. releases), just means that we have to specify it in two places, so there isn't much advantage to this approach.

This PR removes the oci.kbs image and versioning to avoid this potential incompatibility and simplify trustee version updates.

Fixes: #2076

@stevenhorsman stevenhorsman requested a review from a team as a code owner October 2, 2024 12:47
@stevenhorsman stevenhorsman added the test_e2e_libvirt Run Libvirt e2e tests label Oct 2, 2024
@stevenhorsman stevenhorsman changed the title all: Remove oci.kbs version and references kbs: Remove oci.kbs version and references Oct 2, 2024
Copy link
Collaborator

@mkulke mkulke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🕳️🏃‍♂️

@wainersm
Copy link
Member

wainersm commented Oct 2, 2024

Hi @stevenhorsman !

I just recently I tried e2e tests with KBS and I ended up with those variables copied in many places and exported, and I didn't even know if it was all needed. So this change simplifies a lot!

The downside is that we cannot point to an arbitrary SHA-1 of trustee's repo. How likely are we of hitting that case?

There is the case of trustee releasing an outdated kustomization file, for example, https://github.com/confidential-containers/trustee/blob/v0.10.0/kbs/config/kubernetes/base/kustomization.yaml#L8 . Well, in this case we would probably ask for another release....the problem is maybe the time we take to realize the file is not synchronized.

Also perhaps drop the pinning mechanism on kata side too, so that if we test with the wrong image here then we it test wrong there too (increasing the odds of finding the problem?).

@stevenhorsman
Copy link
Member Author

The downside is that we cannot point to an arbitrary SHA-1 of trustee's repo. How likely are we of hitting that case?

I think we should be able to point to a sha with the git.kbs.reference like we had in

reference: e890fc90c384207668fa3a4d6a2f2a2d652797ee
, or have I misunderstood your point?

There is the case of trustee releasing an outdated kustomization file, for example, https://github.com/confidential-containers/trustee/blob/v0.10.0/kbs/config/kubernetes/base/kustomization.yaml#L8 . Well, in this case we would probably ask for another release....the problem is maybe the time we take to realize the file is not synchronized.

Yeah, that is a really good point, but it looks like that not being bumped in the v0.10.0 was the reason for the v0.10.1 release: https://github.com/confidential-containers/trustee/releases/tag/v0.10.1, so hopefully it's already included in their release process

Also perhaps drop the pinning mechanism on kata side too, so that if we test with the wrong image here then we it test wrong there too (increasing the odds of finding the problem?).

Yeah - I think that makes sense. I'll create an issue for this.

@stevenhorsman
Copy link
Member Author

I've created kata-containers/kata-containers#10379 for the kata side of this and as part of that checked with Ding that it's a fair expectation for them to keep their images pinned and "up-to-date"

@mkulke
Copy link
Collaborator

mkulke commented Oct 3, 2024

The downside is that we cannot point to an arbitrary SHA-1 of trustee's repo. How likely are we of hitting that case?

I think that's not unlikely, but then it would be handled like the coco operator dependency: you cannot pick a given kbs revision directly, you'll get whatever is the default in the kbs kustomize manifest at git.trustee.reference. for in-between-release commits that might be :latest

If a developer wants to deploy a specific oci image and also use the provisioner for their test setup, then they have to fork the trustee repo, hardcode their oci image + tag and use that fork in git.trustee.url/reference in versions.yaml.

@stevenhorsman
Copy link
Member Author

Tobin made the comment on the kata issue: kata-containers/kata-containers#10379 (comment) that suggests maybe we do need set the kustomize image tag, but we could read it from the git reference version. That would be re-adding some of this logic back to the e2e test in I think though, or having a pre-step that does the kustomize edit set image at the same time as we clone the repo?

@mkulke
Copy link
Collaborator

mkulke commented Oct 3, 2024

Tobin made the comment on the kata issue: kata-containers/kata-containers#10379 (comment) that suggests maybe we do need set the kustomize image tag, but we could read it from the git reference version. That would be re-adding some of this logic back to the e2e test in I think though, or having a pre-step that does the kustomize edit set image at the same time as we clone the repo?

oof. maybe that could be done as part of the trustee clone step in the CI?

@stevenhorsman
Copy link
Member Author

oof. maybe that could be done as part of the trustee clone step in the CI?

Yeah, that was my thinking.

@stevenhorsman
Copy link
Member Author

oof. maybe that could be done as part of the trustee clone step in the CI?

I've added

pushd kbs/config/kubernetes/base/
kustomize edit set image kbs-container-image=*:${KBS_VERSION}
popd

after the clone, which should hopefully do the trick

@stevenhorsman stevenhorsman marked this pull request as draft October 3, 2024 17:04
@stevenhorsman
Copy link
Member Author

I've added

pushd kbs/config/kubernetes/base/
kustomize edit set image kbs-container-image=*:${KBS_VERSION}
popd

after the clone, which should hopefully do the trick

I've already spotted an issue with this:

https://github.com/confidential-containers/trustee/pkgs/container/key-broker-service/versions?filters%5Bversion_type%5D=tagged only contains released versions
and ghcr.io/confidential-containers/staged-images/kbs has the commit version, so we might need to do some logic and switch the image as well?

@mkulke
Copy link
Collaborator

mkulke commented Oct 3, 2024

I've already spotted an issue with this:

https://github.com/confidential-containers/trustee/pkgs/container/key-broker-service/versions?filters%5Bversion_type%5D=tagged only contains released versions and ghcr.io/confidential-containers/staged-images/kbs has the commit version, so we might need to do some logic and switch the image as well?

we could use this to resolve releases to a sha:

gh api repos/confidential-containers/trustee/commits/v0.10.1 -q .sha
68607d4300dda5a8ae948e2562fd06d09cbd7eca

@stevenhorsman
Copy link
Member Author

we could use this to resolve releases to a sha:

gh api repos/confidential-containers/trustee/commits/v0.10.1 -q .sha
68607d4300dda5a8ae948e2562fd06d09cbd7eca

Yeah, I guess we can do that and always use the staged-image, or alternative do a KBS_VERSION pattern match for vx.y.z and then use the release image in that circumstance?

@stevenhorsman
Copy link
Member Author

we could use this to resolve releases to a sha:

gh api repos/confidential-containers/trustee/commits/v0.10.1 -q .sha
68607d4300dda5a8ae948e2562fd06d09cbd7eca

I've implemented this as a separate commit for ease of review, once people are happy then I'll squash it into the previous commit before merging.

@stevenhorsman stevenhorsman marked this pull request as ready for review October 7, 2024 14:53
# Trustee only updates their staging image reliably with sha tags,
# so switch to use that and convert the version to the sha
KBS_SHA=$(gh api repos/confidential-containers/trustee/commits/${KBS_VERSION} -q .sha)
kustomize edit set image kbs-container-image=ghcr.io/confidential-containers/staged-images/kbs:${KBS_SHA}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @stevenhorsman !

Allow me to ask you 2 changes here:

  1. cat the kustomization file, for debugging sake
  2. Grep for ghcr.io/confidential-containers/staged-images/kbs and ${KBS_SHA}. In some places in kata we have this safeguard, in others not. The problem is that kustomize may silently not set the image (e.g. in case kbs-container-image change), and usually it takes ages until it's realized we were using a wrong image.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For part 2 do you mean in the workflow output:

$ cat kustomization.yaml | grep -A 3 images:
images:
- name: kbs-container-image
  newName: ghcr.io/confidential-containers/key-broker-service
  newTag: test

for debug, or something else?

@wainersm
Copy link
Member

wainersm commented Oct 7, 2024

I've mixed feelings about this piece of code. In one side I'm happy seeing those confusing kbs properties going away from the e2e tests framework, but on the flip side I don't like the half-in-e2e-half-manual setup required to run the KBS based tests as we end up duplicating stuffs on workflows as well as making the execution on dev workstation more obscure.

Anyway, that's going to the right direction IMO.

Copy link
Member

@wainersm wainersm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @stevenhorsman !

Our KBS logic currently checks out the git.kbs version of code, but then edits
the deployment config to override the image based on the oci.kbs version.

This means that there is an assumption that the oci image
of the KBS is compatible with the kustomization, which might not
not always be the case and in the "always safe" case where the
image is built from that exact version of code (e.g. releases), just
means that we have to specify it in two places, so there isn't much advantage
to this approach.

This PR removes the oci.kbs image and versioning to avoid this
potential incompatibility and simplify trustee version updates.

Because there is a chance that the trustee team might not keep
their kustomizational image up-to-date, the recommendation is
that we should ensure that the image version matches the branch
we are using, so perform a kustomize set image at the time we clone
the repo.

As trustee have a separate container registry for development
versus release images for reasons, they only automatically push
builds to the staged-images registry, which are tagged by the sha
of the commit, so we need to convert the KBS_VERSION given
into the sha and then replace the image with the staged-images kbs

Fixes: confidential-containers#2076
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
@stevenhorsman
Copy link
Member Author

I've mixed feelings about this piece of code. In one side I'm happy seeing those confusing kbs properties going away from the e2e tests framework, but on the flip side I don't like the half-in-e2e-half-manual setup required to run the KBS based tests as we end up duplicating stuffs on workflows as well as making the execution on dev workstation more obscure.

Anyway, that's going to the right direction IMO.

Agreed, maybe we should look at making a script for the trustee checkout commands and then call this in the e2e framework if KBS deploy is set.

@stevenhorsman stevenhorsman merged commit 3751949 into confidential-containers:main Oct 9, 2024
28 checks passed
@stevenhorsman stevenhorsman deleted the Remove-kbs-image-kustomization branch October 9, 2024 16:00
stevenhorsman added a commit to stevenhorsman/cloud-api-adaptor that referenced this pull request Oct 10, 2024
In confidential-containers#2077 I forgot about pull_request_target not testing
workflows in PRs and that self-hosted runners wouldn't
have the gh cli installed, so this attempts to fix the workflows
and install gh cli and login. From local testing I think that's enough
as `gh api` doesn't require the repo to be passed in, but if we want
to expand the gh commands we use in future we might need
additional envs set.

Note: I did find a few actions to do this, but they were only used by single digit
projects, so I figured it would be safer to do it ourselves for now

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
stevenhorsman added a commit to stevenhorsman/cloud-api-adaptor that referenced this pull request Oct 10, 2024
In confidential-containers#2077 I forgot about pull_request_target not testing
workflows in PRs and that self-hosted runners wouldn't
have the gh cli installed, so this attempts to fix the workflows
and install gh cli first.

I also set `GH_TOKEN` env in the step that uses `gh` based on the
`gh help environment` doc that states:
GH_TOKEN: an authentication token for github.com API requests.
Setting this avoids being prompted to authenticate.

Note: I did find a few actions to do this, but they were only used by single digit
projects, so I figured it would be safer to do it ourselves for now

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
stevenhorsman added a commit to stevenhorsman/cloud-api-adaptor that referenced this pull request Oct 10, 2024
In confidential-containers#2077 I forgot about pull_request_target not testing
workflows in PRs and that self-hosted runners wouldn't
have the gh cli installed, so this attempts to fix the workflows
and install gh cli first.

I also set `GH_TOKEN` env in the step that uses `gh` based on the
`gh help environment` doc that states:
GH_TOKEN: an authentication token for github.com API requests.
Setting this avoids being prompted to authenticate.

Note: I did find a few actions to do this, but they were only used by single digit
projects, so I figured it would be safer to do it ourselves for now

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
wainersm pushed a commit that referenced this pull request Oct 10, 2024
In #2077 I forgot about pull_request_target not testing
workflows in PRs and that self-hosted runners wouldn't
have the gh cli installed, so this attempts to fix the workflows
and install gh cli first.

I also set `GH_TOKEN` env in the step that uses `gh` based on the
`gh help environment` doc that states:
GH_TOKEN: an authentication token for github.com API requests.
Setting this avoids being prompted to authenticate.

Note: I did find a few actions to do this, but they were only used by single digit
projects, so I figured it would be safer to do it ourselves for now

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
davidhadas pushed a commit to davidhadas/cloud-api-adaptor that referenced this pull request Oct 23, 2024
In confidential-containers#2077 I forgot about pull_request_target not testing
workflows in PRs and that self-hosted runners wouldn't
have the gh cli installed, so this attempts to fix the workflows
and install gh cli first.

I also set `GH_TOKEN` env in the step that uses `gh` based on the
`gh help environment` doc that states:
GH_TOKEN: an authentication token for github.com API requests.
Setting this avoids being prompted to authenticate.

Note: I did find a few actions to do this, but they were only used by single digit
projects, so I figured it would be safer to do it ourselves for now

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
davidhadas pushed a commit to davidhadas/cloud-api-adaptor that referenced this pull request Oct 24, 2024
In confidential-containers#2077 I forgot about pull_request_target not testing
workflows in PRs and that self-hosted runners wouldn't
have the gh cli installed, so this attempts to fix the workflows
and install gh cli first.

I also set `GH_TOKEN` env in the step that uses `gh` based on the
`gh help environment` doc that states:
GH_TOKEN: an authentication token for github.com API requests.
Setting this avoids being prompted to authenticate.

Note: I did find a few actions to do this, but they were only used by single digit
projects, so I figured it would be safer to do it ourselves for now

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test_e2e_libvirt Run Libvirt e2e tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CI: Remove KBS_IMAGE option
3 participants