
WIP: build/deploy csi-proxy in host-process container #179

Closed
wants to merge 4 commits

Conversation

@marosset (Contributor) commented Nov 5, 2021

Signed-off-by: Mark Rossetti <marosset@microsoft.com>

What type of PR is this?
/kind feature

What this PR does / why we need it:
This PR adds functionality to build a container image containing csi-proxy.exe and deploy it as a DaemonSet using HostProcess containers.
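
For illustration, deploying the DaemonSet this PR adds would presumably look something like the following (a hypothetical usage sketch; deploy/csi-proxy-ds.yaml is the manifest path referenced later in this review, and the name/namespace match the manifest snippet shown further down):

# Hypothetical usage sketch, assuming the image has already been built and pushed.
kubectl apply -f deploy/csi-proxy-ds.yaml
kubectl -n kube-system get daemonset csi-proxy   # name and namespace as in the manifest shown later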

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:
I have tested csi-proxy deployed in a host-process container in conjunction with azurefile-csi-driver, and everything is behaving as expected.

Does this PR introduce a user-facing change?:


Signed-off-by: Mark Rossetti <marosset@microsoft.com>

@k8s-ci-robot (Contributor)

@marosset: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the do-not-merge/work-in-progress, kind/feature, do-not-merge/release-note-label-needed, and cncf-cla: yes labels on Nov 5, 2021

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: marosset
To complete the pull request process, please assign saad-ali after the PR has been reviewed.
You can assign the PR to them by writing /assign @saad-ali in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot (Contributor)

Hi @marosset. Thanks for your PR.

I'm waiting for a kubernetes-csi member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the needs-ok-to-test and size/M labels on Nov 5, 2021

@marosset (Contributor, Author) commented Nov 5, 2021

I think this should probably get tied into https://github.com/kubernetes-csi/csi-release-tools but I need to familiarize myself with that process.

image/Dockerfile (outdated review thread, resolved)
if [ $(docker buildx ls | grep -c img-builder) == 0 ]; then
docker buildx create --name img-builder
fi
docker buildx use img-builder

Review comment (Contributor):

might consider cleaning up the image builder:

trap 'docker buildx rm img-builder' EXIT


export DOCKER_CLI_EXPERIMENTAL=enabled
if [ $(docker buildx ls | grep -c img-builder) == 0 ]; then
docker buildx create --name img-builder

Review comment (Contributor):

In a recent update to buildx I had to begin explicitly requesting windows:

docker buildx create --name img-builder --use --platform windows/amd64
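
Putting the two suggestions above together, a minimal sketch of how the builder setup in the build script might look; this is only an illustration, not the PR's actual script, and the img-builder name, the EXIT trap, and the platform flag are taken from the comments above:

#!/usr/bin/env bash
# Hypothetical sketch combining the review suggestions above; not the PR's actual build script.
set -euo pipefail

export DOCKER_CLI_EXPERIMENTAL=enabled

# Create the builder only if it does not exist yet, explicitly requesting the Windows platform.
if [ "$(docker buildx ls | grep -c img-builder)" -eq 0 ]; then
  docker buildx create --name img-builder --use --platform windows/amd64
  # Remove the builder when the script exits so repeated runs start clean.
  trap 'docker buildx rm img-builder' EXIT
fi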

docker buildx build --platform windows/amd64 --output=type=registry --pull -f Dockerfile --build-arg BASE=mcr.microsoft.com/windows/nanoserver:1809 -t $REGISTRY/csi-proxy:$VERSION-1809 ..
docker buildx build --platform windows/amd64 --output=type=registry --pull -f Dockerfile --build-arg BASE=mcr.microsoft.com/windows/nanoserver:ltsc2022 -t $REGISTRY/csi-proxy:$VERSION-ltsc2022 ..

docker manifest create $REGISTRY/csi-proxy:$VERSION $REGISTRY/csi-proxy:$VERSION-1809 $REGISTRY/csi-proxy:$VERSION-ltsc2022

Review comment (Contributor):

In Kubernetes test images we keep these values as separate config: https://github.com/kubernetes/kubernetes/blob/master/test/images/agnhost/BASEIMAGE

I guess we only have 2019 and WS 2022 to deal with now so this isn't an issue
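
If the base images ever do need to live in separate config, a hypothetical variant of the build commands above could read them from a file instead of hard-coding the two tags (the windows-baseimages file name and its format are assumptions for illustration; REGISTRY and VERSION are the same variables used in the snippet above):

# Hypothetical sketch: drive the per-OS builds from a config file instead of hard-coding them.
# Assumes a file named windows-baseimages with one "<tag-suffix> <base-image>" pair per line, e.g.:
#   1809 mcr.microsoft.com/windows/nanoserver:1809
#   ltsc2022 mcr.microsoft.com/windows/nanoserver:ltsc2022
manifest_images=()
while read -r suffix base; do
  [ -z "$suffix" ] && continue  # skip blank lines
  docker buildx build --platform windows/amd64 --output=type=registry --pull \
    -f Dockerfile --build-arg BASE="$base" -t "$REGISTRY/csi-proxy:$VERSION-$suffix" ..
  manifest_images+=("$REGISTRY/csi-proxy:$VERSION-$suffix")
done < windows-baseimages
docker manifest create "$REGISTRY/csi-proxy:$VERSION" "${manifest_images[@]}"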

docker manifest inspect $REGISTRY/csi-proxy:$VERSION
docker manifest push $REGISTRY/csi-proxy:$VERSION


Review comment (Contributor):

nit: extra lines

@mauriciopoppe (Member) left a comment:

This looks interesting, thanks for the PR!

I think this should probably get tied into kubernetes-csi/csi-release-tools but I need to familiarize myself with that process.

Yeah, we use release-tools/build.make; it already handles creating a multiarch binary for many Windows platforms. I'd create a Dockerfile.Windows file like https://github.com/kubernetes-csi/node-driver-registrar/blob/master/Dockerfile.Windows and modify the root Makefile to call push-multiarch instead.

hostNetwork: true
containers:
- name: csi-proxy
image: mrosse3/csi-proxy:v1.0.2

Review comment (Member):

should be k8s.gcr.io/sig-storage/csi-proxy before the merge

Review comment (Contributor, Author):

ack!

@mauriciopoppe (Member)

/ok-to-test

@k8s-ci-robot added the ok-to-test label and removed the needs-ok-to-test label on Nov 5, 2021

@ddebroy (Contributor) left a comment:

This is great! It would be nice to align with release-tools and a structure like https://github.com/kubernetes-csi/csi-driver-smb/tree/master/deploy.

kind: DaemonSet
metadata:
name: csi-proxy
namespace: kube-system # Is this the right namespace?

deploy/csi-proxy-ds.yaml (outdated review thread, resolved)

@marosset (Contributor, Author) commented Nov 8, 2021

Thanks all, let me look into aligning with the release-tools.

@marosset (Contributor, Author) commented Nov 8, 2021

@mauriciopoppe I made some updates and ran make push-multiarch PULL_BASE_REF=master REGISTRY_NAME=mrosse3 BUILD_PLATFORMS="windows amd64 .exe nanoserver:1809; windows amd64 .exe nanoserver:ltsc2022".

The image it produced looks good - https://hub.docker.com/layers/mrosse3/csi-proxy/canary/images/sha256-e085d3942a3bef9c9f4ffcd6f534bc75f1d18fcca4cec3dc3093c7369bc7ab7f?context=explore

Is there anything else I need to do to integrate w/ the release tools?
This was a lot more straight-forward than I was expecting!

@mauriciopoppe (Member)

@marosset I've been reading the scripts related to our build process; we have two:

I think the overall objective is to call the function gcr_cloud_build declared in release-tools/prow.sh; that function takes care of building a multiarch image. However, we shouldn't break the code that generates the standalone binary, as that is another release artifact.

For the presubmit job the file used is .prow.sh, which sources release-tools/prow.sh and calls run_with_go. I think .prow.sh should use the function gcr_cloud_build instead of the two run_with_go calls; gcr_cloud_build calls run_with_go under the hood.

For the postsubmit job the file used is cloudbuild.yaml, which calls .cloudbuild.sh. In other CSI repos the file .cloudbuild.sh is a symlink to release-tools/cloudbuild.sh, which also calls gcr_cloud_build; we might make it a symlink or call the same functions as release-tools/cloudbuild.sh.
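
For reference, a minimal sketch of what a .cloudbuild.sh along those lines might look like; this assumes it simply mirrors release-tools/cloudbuild.sh as described above, since that file's exact contents are not shown in this thread:

#!/usr/bin/env bash
# Hypothetical sketch of .cloudbuild.sh for the postsubmit job, based on the description above.
set -euo pipefail

# Source the shared release-tools helpers, which define gcr_cloud_build.
. release-tools/prow.sh

# Build and push the multiarch image(s).
gcr_cloud_build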

@marosset (Contributor, Author)

@marosset I've been reading the scripts related to our build process; we have two:

I think the overall objective is to call the function gcr_cloud_build declared in release-tools/prow.sh; that function takes care of building a multiarch image. However, we shouldn't break the code that generates the standalone binary, as that is another release artifact.

For the presubmit job the file used is .prow.sh, which sources release-tools/prow.sh and calls run_with_go. I think .prow.sh should use the function gcr_cloud_build instead of the two run_with_go calls; gcr_cloud_build calls run_with_go under the hood.

For the postsubmit job the file used is cloudbuild.yaml, which calls .cloudbuild.sh. In other CSI repos the file .cloudbuild.sh is a symlink to release-tools/cloudbuild.sh, which also calls gcr_cloud_build; we might make it a symlink or call the same functions as release-tools/cloudbuild.sh.

Thanks @mauriciopoppe this is very helpful.
Let me make these updates.

Signed-off-by: Mark Rossetti <marosset@microsoft.com>
.prow.sh (outdated diff)
@@ -14,3 +14,6 @@ ensure_paths
# main
run_with_go "${CSI_PROW_GO_VERSION_BUILD}" make all "GOFLAGS_VENDOR=${GOFLAGS_VENDOR}" "BUILD_PLATFORMS=${CSI_PROW_BUILD_PLATFORMS}"
run_with_go "${CSI_PROW_GO_VERSION_BUILD}" make -k test "GOFLAGS_VENDOR=${GOFLAGS_VENDOR}" 2>&1 | make_test_to_junit

# build / push multi-arch images for validation
gcr_cloud_build

Review comment (Contributor, Author):

gcr_cloud_build just calls make push-multiarch, so I left in the calls to run_with_go make all and make test.
Should I update the push-multiarch make target to run the tests too so this file can just call gcr_cloud_build and not the run_with_go commands?
That seems a little convoluted to me.

@k8s-ci-robot (Contributor)

@marosset: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: pull-kubernetes-csi-csi-proxy
Commit: 498fe4f
Required: true
Rerun command: /test pull-kubernetes-csi-csi-proxy

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@marosset (Contributor, Author)

@mauriciopoppe is gcr.io/k8s-staging-sig-storage the correct registry to use for presubmit jobs?
When I didn't specify the registry explicitly in .prow.sh the envvar wasn't set in the gcr_cloud_build call.

I'm getting the following errors

#9 [auth] k8s-staging-sig-storage/csi-proxy:pull,push token for gcr.io
#9 DONE 0.0s

#8 exporting to image
#8 pushing layers 2.2s done
#8 ERROR: failed to authorize: failed to fetch oauth token: unexpected status: 403 Forbidden
------
 > exporting to image:
------
error: failed to solve: failed to fetch oauth token: unexpected status: 403 Forbidden
+ docker buildx rm multiarchimage-buildertest
make: *** [release-tools/build.make:149: push-multiarch-csi-proxy] Error 1
+ EXIT_VALUE=2
+ set +o xtrace

# Extract tag-n-hash value from GIT_TAG (form vYYYYMMDD-tag-n-hash) for REV value.
REV=v$(echo "$GIT_TAG" | cut -f3- -d 'v')
: ${CSI_PROW_BUILD_PLATFORMS:="windows amd64 .exe nanoserver:1809; windows amd64 .exe nanoserver:ltsc2022"}
: ${REGISTRY_NAME:="gcr.io/k8s-staging-sig-storage"}

@mauriciopoppe (Member) commented Nov 30, 2021:

@marosset so sorry for the late response. I checked what other CSI projects do and none of them push the presubmit artifacts to a remote! They instead just build the linux image locally and run a kind cluster on the location where the prow job runs for the integration tests.

I recently added a multidistro build to kubernetes-sigs/local-volume-provisioner in https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner/pull/279/files, every prow job inherits the project through the env var $PROJECT which can be used for the presubmit image location.

One thing that I saw that might cause problems is that the tag is always canary for master. I think it should include some info about the git tag to avoid stepping on the changes of another contributor, but we can fix that later.
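
To illustrate the tagging concern, one possible approach (hypothetical, not something this PR implements) would be to derive a per-change tag from the short git SHA so presubmit images from different contributors don't overwrite each other; the IMAGE_TAG variable name is an assumption and may not match what release-tools/build.make actually accepts:

# Hypothetical sketch only: tag presubmit images per change instead of always pushing "canary".
GIT_SHA="$(git rev-parse --short HEAD)"
# $PROJECT is injected into prow jobs, as mentioned above.
: "${REGISTRY_NAME:=gcr.io/${PROJECT}}"
# IMAGE_TAG is an assumed variable name; the real build.make variable may differ.
make push-multiarch REGISTRY_NAME="${REGISTRY_NAME}" IMAGE_TAG="canary-${GIT_SHA}"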

Review comment (Member):

This might make the presubmit job pass, but I think I might be forgetting about the postsubmit image push to gcr.io/k8s-staging-sig-storage; I'll check that too.

Review comment (Member):

: ${REGISTRY_NAME:="gcr.io/k8s-staging-sig-storage"}

shouldn't be set in this file; this is the file that runs on postsubmit

@@ -4,7 +4,8 @@

# # Only these tests make sense for csi-proxy
: ${CSI_PROW_TESTS:="unit"}
: ${CSI_PROW_BUILD_PLATFORMS:="windows amd64 .exe"}
: ${CSI_PROW_BUILD_PLATFORMS:="windows amd64 .exe nanoserver:1809"}
: ${REGISTRY_NAME:="gcr.io/k8s-staging-sig-storage"}

@mauriciopoppe (Member) commented Nov 30, 2021:

This should be:

: ${REGISTRY_NAME:="gcr.io/$PROJECT"}

@@ -4,7 +4,8 @@

# # Only these tests make sense for csi-proxy
: ${CSI_PROW_TESTS:="unit"}
: ${CSI_PROW_BUILD_PLATFORMS:="windows amd64 .exe"}
: ${CSI_PROW_BUILD_PLATFORMS:="windows amd64 .exe nanoserver:1809"}

Review comment (Member):

CSI_PROW_BUILD_PLATFORMS should already have a default; I think we should remove this line and use that default instead.

@@ -14,3 +15,6 @@ ensure_paths
# main
run_with_go "${CSI_PROW_GO_VERSION_BUILD}" make all "GOFLAGS_VENDOR=${GOFLAGS_VENDOR}" "BUILD_PLATFORMS=${CSI_PROW_BUILD_PLATFORMS}"
run_with_go "${CSI_PROW_GO_VERSION_BUILD}" make -k test "GOFLAGS_VENDOR=${GOFLAGS_VENDOR}" 2>&1 | make_test_to_junit

# build / push multi-arch images for validation
gcr_cloud_build

Review comment (Member):

For clarification, this file (.prow.sh) is what runs in the presubmit jobs, and .cloudbuild.sh is the one that runs in the postsubmit jobs.

@mauriciopoppe (Member)

@ddebroy, @jingxu97 and I talked about supporting this mode; my opinion is:

  • Having the binary running in a container makes its deployment easier than it is now; CSI Drivers are in control of creating a bundle with CSI Proxy too, and there's no need to bootstrap the node by downloading the binary and running it as a Windows service.
  • Supporting a new artifact (the image) is harder: every time we create a new release we also have to promote the staging image to the k8s.gcr.io release project, and if we push an image we have to support it for the foreseeable future.
  • This PR is an intermediate step but might not be needed; HostProcess containers are already beta in 1.23. I'd put more effort into modifying this repo to become a library and instead run the CSI Drivers/Addons with the HostProcess container setup.

@marosset (Contributor, Author) commented Dec 1, 2021

@ddebroy, @jingxu97 and I talked about supporting this mode; my opinion is:

  • Having the binary running in a container makes its deployment easier than it is now; CSI Drivers are in control of creating a bundle with CSI Proxy too, and there's no need to bootstrap the node by downloading the binary and running it as a Windows service.
  • Supporting a new artifact (the image) is harder: every time we create a new release we also have to promote the staging image to the k8s.gcr.io release project, and if we push an image we have to support it for the foreseeable future.
  • This PR is an intermediate step but might not be needed; HostProcess containers are already beta in 1.23. I'd put more effort into modifying this repo to become a library and instead run the CSI Drivers/Addons with the HostProcess container setup.

I've been out of the office for a bit but am back now.
This makes sense. If we go the library route then we'd need to update many projects, correct?

@mauriciopoppe (Member)

This makes sense. If we go the library route then we'd need to update many projects, correct?

Correct, we would:

  • update this codebase so that the client library that it exposes doesn't use gRPC and instead just uses the server methods
  • update CSI Drivers and addons to use CSI Proxy as a library

There's a doc that @ddebroy started but I haven't had a chance to complete yet: https://docs.google.com/document/d/15lgXyWytWSg1PuEcHl72eqgLoXc_i8kXmQcrXhRpjO0/edit#heading=h.6vn5bqze2ffs

@marosset (Contributor, Author)

/close
I'm moving this work to github.com/kubernetes-sigs/sig-windows-tools with #179

@k8s-ci-robot (Contributor)

@marosset: Closed this PR.

In response to this:

/close
I'm moving this work to github.com/kubernetes-sigs/sig-windows-tools with #179

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels: cncf-cla: yes, do-not-merge/release-note-label-needed, do-not-merge/work-in-progress, kind/feature, ok-to-test, size/M
5 participants