-
Notifications
You must be signed in to change notification settings - Fork 151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Centralize cleanup of created resources #261
Centralize cleanup of created resources #261
Conversation
Hi @timoreimann. Thanks for your PR. I'm waiting for a kubernetes-csi member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this a lot. Much cleaner (no pun intended, or did I?) cleanup code...
The csi-test
repo is also a Go module that others may import and call directly (we do that in PMEM-CSI), so this is a breaking API change which must be dealt with accordingly:
- announce it in the release note
- bump the version of the repo to v4.0.0, which implies updating the import paths
If this change and the rename proposal finds consensus, I'm happy to carry it out either through another commit or a follow-up PR.
I think we should do that in a separate commit.
pkg/sanity/cleanup.go
Outdated
NodeClient csi.NodeClient | ||
Context *TestContext | ||
// ControllerClient is meant for struct-internal use only | ||
csi.ControllerClient |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's meant for that, but we don't enforce that because the embedded member is getting exported, right?
I'm on the edge whether the API should prevent access by making it an unexpected member (controllerClient csi.ControllerClient
). There may be valid cases where a user may want to call the methods that aren't wrapped.
I think I prefer keeping it like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a comment only because unexporting wouldn't help: the (repo-internal) consumers are all part of the same sanity
package, so they'd still be able to access controllerClient
. It'd have to be moved into a sub-package, but I didn't want to go that far in my PR.
I can't think of too many reasons why users wouldn't want to go through the Cleanup
layer: unless csi-test was buggy, it should do the right thing. The one case I can think of is when the Cleanup()
part shouldn't be executed. Maybe that's reasonable, so let's stick to keeping it exported. I updated the comment to make the implications more clear.
pkg/sanity/cleanup.go
Outdated
}, | ||
); err != nil { | ||
logger.Printf("warning: NodeUnstageVolume: %s", err) | ||
if status.Code(err) != codes.NotFound { | ||
Fail(fmt.Sprintf("NodeUnpublishVolume failed: %s", err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't Fail
during cleanup while there is still work to do. Instead the code should try to execute all operations, log and/or collect failures, and then in the end fail the test.
Otherwise cleaning up stops early although some other volumes perhaps could be deleted successfully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I created a small logger wrapper that tries to simplify the task.
pkg/sanity/cleanup.go
Outdated
// successfully created. | ||
func (cl *Cleanup) MustCreateSnapshot(ctx context.Context, req *csi.CreateSnapshotRequest) *csi.CreateSnapshotResponse { | ||
snap, err := cl.createSnapshot(ctx, req) | ||
Expect(err).NotTo(HaveOccurred()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A problem with assertions in helper functions is that Ginkgo only reports one source line for the error by default; even with -trace
, that initial line is typically not very informative.
Better use ExpectWithOffset
and add an offset parameter to the MustCreateSnapshot
parameters so that this helper function an also be called indirectly through some other helper functions.
Also, NotTo(HaveOccurred())
without additional explanation is potentially problematic, depending on how much information is in the error. Much too often the error is very generic, in which case the assertion produced by Gomega doesn't say anything about what failed.
Better always use NoTo(HaveOccurred(), "create snapshot"
, potentially even with further parameters.
I know, much of the existing code doesn't do that properly either 😢
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, I remember you previously mentioned how To()
should have a description but forgot about it again. I made sure it's now set.
I also updated the code to specify and/or pass through the offset everywhere. It's not a beauty though, I wonder if ginkgo could do better here by deriving the offset automatically (at least after the top-level t.Helper()
-like indicator).
92973ec
to
9cb04b3
Compare
@pohly all comments addressed, except for the two regarding the breaking change because I have one more dependent question: I might not be fully familiar with how consumers of The reason I'm asking is that I'm wondering if we should take advantage of the breaking change and start hiding parts of the package under an |
Yes, in PMEM-CSI we do have custom tests that are built on top of the sanity infrastructure, in addition to running the pre-defined tests: https://github.com/intel/pmem-csi/blob/018313154dff214da21fe39e6902d87857bc26e8/test/e2e/storage/sanity.go#L191-L230
Please don't 😅 |
@pohly alrighty, I added a release note and moved v3 to v4 while also updating the links. Let me know if I missed something. |
Would love to see this getting approved and merged soonish because it touches a fair amount of existing tests, so other merges happening meanwhile stand a fair chance of generating merge conflicts. |
/kind api-change |
/ok-to-test |
pkg/sanity/logger.go
Outdated
|
||
// Assert fails the spec if any error was logged. | ||
func (l *logger) Assert(offset int) { | ||
ExpectWithOffset(offset+1, l.failed).To(BeFalse()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Directly calling https://pkg.go.dev/github.com/onsi/ginkgo?tab=doc#Fail with a caller skip parameter and a suitable message is probably going to look better in the resulting test failure.
If you want to make the failure message more informative, count errors and include the count here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defer logger.Assert()
looks odd. Assert - assert what?
Perhaps logger.CheckForErrors()
or (similar to framework.ExpectNoError
) logger.ExpectNoErrors()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented both.
45af82a
to
e4952d3
Compare
I rebased once more. Also noticed I missed some changes needed for the v3->v4 transition, so updated that as well. It seems like I had to run |
I think go.mod specifies what we are compatible with. However, in practice this isn't getting tested: we only test with the Go version specified in csi-test/release-tools/travis.yml Line 9 in e89bc15
As long as Go 1.12 and 1.13 produce the same output, that doesn't matter. It worked here, so this is FYIO. However, I have seen cases where it didn't work and the pre-merge check with Go 1.13 complained. This is true for all Kubernetes-CSI repos. I wonder whether we should:
The latter has the problem that it prevents downstream users from using an older Go even when that would still technically work. This is only an issue for repos that may get imported by others as a dependency (csi-test, csi-lib-utils). @msau42: any thoughts on this? |
@timoreimann please rebase. |
e4952d3
to
138b631
Compare
@pohly rebased. I also verified that 1.13 does not differ with regards to |
d7256f6
to
25e66ce
Compare
36e1977
to
2e872cb
Compare
2e872cb
to
357a804
Compare
@pohly I figured out why the tests were failing: two I pushed a fixing commit and rebased from master. From my point of view, the PR is now good to move on. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: pohly, timoreimann The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[1] accidentally swapped the cleanup order which represents a deviation to the previous behavior that not all CSI drivers may be able to handle. This change restores the original order. [1]: kubernetes-csi#261
This addresses a regression in [1] causing plugins to return an in-use error (FAILED_PRECONDITION) when a sourcing resource (i.e., a snapshot or a volume) is deleted before the sourced volume is. [1]: kubernetes-csi#261
This addresses a regression in [1] causing plugins to return an in-use error (FAILED_PRECONDITION) when a sourcing resource (i.e., a snapshot or a volume) is deleted before the sourced volume is. [1]: kubernetes-csi#261
734c2b9 Merge pull request kubernetes-csi#265 from Rakshith-R/consider-main-branch f95c855 Merge pull request kubernetes-csi#262 from huww98/golang-toolchain 3c8d966 Treat main branch as equivalent to master branch e31de52 Merge pull request kubernetes-csi#261 from huww98/golang fd153a9 Bump golang to 1.23.1 a8b3d05 pull-test.sh: fix "git subtree pull" errors 6b05f0f use new GOTOOLCHAIN env to manage go version 227577e Merge pull request kubernetes-csi#258 from gnufied/enable-race-detection e1ceee2 Always enable race detection while running tests 988496a Merge pull request kubernetes-csi#257 from jakobmoellerdev/csi-prow-sidecar-e2e-path 028f8c6 chore: bump to Go 1.22.5 69bd71e chore: add CSI_PROW_SIDECAR_E2E_PATH f40f0cc Merge pull request kubernetes-csi#256 from solumath/master cfa9210 Instruction update 379a1bb Merge pull request kubernetes-csi#255 from humblec/sidecar-md a5667bb fix typo in sidecar release process 4967685 Merge pull request kubernetes-csi#254 from bells17/add-github-actions d9bd160 Update skip list in codespell GitHub Action adb3af9 Merge pull request kubernetes-csi#252 from bells17/update-go-version f5aebfc Add GitHub Actions workflows b82ee38 Merge pull request kubernetes-csi#253 from bells17/fix-typo c317456 Fix typo 0a78505 Bump to Go 1.22.3 edd89ad Merge pull request kubernetes-csi#251 from jsafrane/add-logcheck 043fd09 Add test-logcheck target d7535ae Merge pull request kubernetes-csi#250 from jsafrane/go-1.22 b52e7ad Update go to 1.22.2 14fdb6f Merge pull request kubernetes-csi#247 from msau42/prow dc4d0ae Merge pull request kubernetes-csi#249 from jsafrane/use-go-version e681b17 Use .go-version to get Kubernetes go version 9b4352e Update release playbook c7bb972 Fix release notes script to use fixed tags 463a0e9 Add script to update specific go modules b54c1ba Merge pull request kubernetes-csi#246 from xing-yang/go_1.21 5436c81 Change go version to 1.21.5 267b40e Merge pull request kubernetes-csi#244 from carlory/sig-storage b42e5a2 nominate self (carlory) as kubernetes-csi reviewer a17f536 Merge pull request kubernetes-csi#210 from sunnylovestiramisu/sidecar 011033d Use set -x instead of die 5deaf66 Add wrapper script for sidecar release f8c8cc4 Merge pull request kubernetes-csi#237 from msau42/prow b36b5bf Merge pull request kubernetes-csi#240 from dannawang0221/upgrade-go-version adfddcc Merge pull request kubernetes-csi#243 from pohly/git-subtree-pull-fix c465088 pull-test.sh: avoid "git subtree pull" error 7b175a1 Update csi-test version to v5.2.0 987c90c Update go version to 1.21 to match k/k 2c625d4 Add script to generate patch release notes f9d5b9c Merge pull request kubernetes-csi#236 from mowangdk/feature/bump_csi-driver-host-path_version b01fd53 Bump csi-driver-host-path version up to v1.12.0 984feec Merge pull request kubernetes-csi#234 from siddhikhapare/csi-tools 1f7e605 fixed broken links of testgrid dashboard de2fba8 Merge pull request kubernetes-csi#233 from andyzhangx/andyzhangx-patch-1 cee895e remove windows 20H2 build since it's EOL long time ago 670bb0e Merge pull request kubernetes-csi#229 from marosset/fix-codespell-errors 35d5e78 Merge pull request kubernetes-csi#219 from yashsingh74/update-registry 63473cc Merge pull request kubernetes-csi#231 from coulof/bump-go-version-1.20.5 29a5c76 Merge pull request kubernetes-csi#228 from mowangdk/chore/adopt_kubernetes_recommand_labels 8dd2821 Update cloudbuild image with go 1.20.5 1df23db Merge pull request kubernetes-csi#230 from msau42/prow 1f92b7e Add ginkgo timeout to e2e tests to help catch any stuck tests 2b8b80e fixing some codespell errors c10b678 Merge pull request kubernetes-csi#227 from coulof/check-sidecar-supported-versions 72984ec chore: adopt kubernetes recommand label b055535 Header bd0a10b typo c39d73c Add comments f6491af Script to verify EOL sidecar version 4133d1d Merge pull request kubernetes-csi#226 from msau42/cloudbuild 8d519d2 Pin buildkit to v0.10.6 to workaround v0.11 bug with docker manifest 6e04a03 Merge pull request kubernetes-csi#224 from msau42/cloudbuild 26fdfff Update cloudbuild image 6613c39 Merge pull request kubernetes-csi#223 from sunnylovestiramisu/update 0e7ae99 Update k8s image repo url 77e47cc Merge pull request kubernetes-csi#222 from xinydev/fix-dep-version 155854b Fix dep version mismatch 8f83905 Merge pull request kubernetes-csi#221 from sunnylovestiramisu/go-update 1d3f94d Update go version to 1.20 to match k/k v1.27 e322ce5 Merge pull request kubernetes-csi#220 from andyzhangx/fix-golint-error b74a512 test: fix golint error 901bcb5 Update registry k8s.gcr.io -> registry.k8s.io git-subtree-dir: release-tools git-subtree-split: 734c2b950c4b31f64b63052c64ffa5929d1c9b97
What type of PR is this?
What this PR does / why we need it:
This change revamps the way resources (like volumes and now also snapshots) are managed in tests with regards to cleaning up. Instead of putting the onus of cleaning up on the test author, we extend
Cleanup
to automatically (un-)register resources as they are being used.Cleanup
now exposes a single API that implements bothControllerClient
andNodeClient
to make it easier for all garbage collection-worthy requests to be funnelled through the new API. The way this is implemented inCleanup
is by embedding bothControllerClient
andNodeClient
, and proxying to the actual methods before registering cleanup tasks and returning the results.Consequently, we can throw away large chunks of cleanup test code and unify all
{Controller,Node}Client
access to theCleanup
variable. In essence, this makes it much easier to do the right thing as a test author since each existingDescribe
context will provide a singleinteraction point to the CSI APIs only.
For frequently used resource creation operations, we also provide
Must*
equivalents that fail the test if the results are unexpected. This makes our test code even more streamlined by DRYing out the number of assertions called.List of other changes:
cleanup.go
:DeleteVolumes
toCleanup
.MustCreateSnapshotFromVolumeRequest
to create a sourcing volume and a snapshot in one go.controller.go
,node.go
:Cleanup
only. (That is, do not offerControllerClient
andNodeClient
directly anymore.)Cleanup.Cleanup
inAfterEach
where missing.Cleanup
.Must*
equivalents were applicable.HaveLen
to simplify length assertions.Cleanup
variable initialization consistent.Rename
Cleanup
toResources
and the file name accordingly.Which issue(s) this PR fixes:
Fixes #260
Special notes for your reviewer:
(As agreed on during the review, this PR now also does the rename.)Cleanup
probably deserves a more generic name as this point, likeResources
. I hesitated from renaming the variable (and the hosting file name) though to ease diffing the change. If this change and the rename proposal finds consensus, I'm happy to carry it out either through another commit or a follow-up PR.Does this PR introduce a user-facing change?:
/cc @pohly