Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Throw fatal error and ask user to run cleanup when a k8s resource already exists in the cluster #135

Merged
merged 5 commits into from
Jan 29, 2024

Conversation

reetasingh
Copy link
Contributor

@reetasingh reetasingh commented Jan 27, 2024

When k8s resource like Namespace, clusterrole, clusterrole, binding, service account already exists in the cluster the RunE2E() test fails because the Create() operation returns error. But the error in the logs are not clear. Throw Fatal error when resource already exists and ask to run cleanup

Eg: when namespace already exists in the cluster I am getting below error

reeta@Reetas-MacBook-Pro hydrophone % make run
go run main.go
12:20:53 INF API endpoint : https://127.0.0.1:57154
12:20:53 INF Server version : version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.0", GitCommit:"4ce5a8954017644c5420bae81d72b09b735c21f0", GitTreeState:"clean", BuildDate:"2022-05-19T15:42:59Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/arm64"}
12:20:53 INF Using conformance image : 'registry.k8s.io/conformance:v1.24.0'
12:20:53 INF Using busybox image : 'registry.k8s.io/e2e-test-images/busybox:1.36.1-1'
12:20:53 INF Test framework will start '1' threads and use verbosity '4'
12:20:53 INF namespace already exist e2e-conformance-test
12:20:53 INF namespace created 

12:20:53 ERR an empty namespace may not be set during creation
exit status 1
make: *** [run] Error 1

To recreate this failure run make run twice

With this fix in place, it works fine

reeta@Reetas-MacBook-Pro hydrophone % make run
go run main.go
14:04:22 INF API endpoint : https://127.0.0.1:50108
14:04:22 INF Server version : version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.0", GitCommit:"4ce5a8954017644c5420bae81d72b09b735c21f0", GitTreeState:"clean", BuildDate:"2022-05-19T15:42:59Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/arm64"}
14:04:22 INF Using conformance image : 'registry.k8s.io/conformance:v1.24.0'
14:04:22 INF Using busybox image : 'registry.k8s.io/e2e-test-images/busybox:1.36.1-1'
14:04:22 INF Test framework will start '1' threads and use verbosity '4'
14:04:22 ERR namespace already exist conformance. Please run cleanup to start a fresh run
exit status 1
make: *** [run] Error 1
reeta@Reetas-MacBook-Pro hydrophone % 

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 27, 2024
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jan 27, 2024
if err != nil {
if errors.IsAlreadyExists(err) {
log.Printf("namespace already exist %s", common.PodName)
log.Printf("namespace already exist %s", conformanceNS.ObjectMeta.Name)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, we were using wrong resource name in the log

@rjsadow
Copy link
Collaborator

rjsadow commented Jan 27, 2024

This looks good, but how does it account for the state of an object? For example after cleanup is run then the namespace will still exist in a terminating state.

@reetasingh
Copy link
Contributor Author

directory:
Executable path: /usr/local/bin/ginkgo
Args (comma-delimited): /usr/local/bin/ginkgo,--focus=[Conformance],--skip=,--noColor=true,/usr/local/bin/e2e.test,--,--disable-log-dump,--repo-root=/kubernetes,--provider=skeleton,--report-dir=/tmp/results,--kubeconfig=,%!s()
sep is " " args are "%!s()"split [%!s()]

This looks good, but how does it account for the state of an object? For example after cleanup is run then the namespace will still exist in a terminating state.

Thats true, we are not checking status of the resource here

@rakshitgondwal
Copy link
Member

Can we also rework the logs here? The logs are somewhat misleading if a resource already exists:

12:24:55 INF namespace already exist conformance
12:24:55 INF namespace created conformance

@dims
Copy link
Member

dims commented Jan 27, 2024

So, here's the thing, if the conformance namespace is already there then that means something that ran earlier will not be cleaned up. So we have to explicitly tell the user to clean the namespace by hand AND THEN re-run hydrophone.

If they don't and we try to reuse whatever exists, they will run into problems (tests may fail randomly etc)

So +1 to cleanup logs and be explicit about. BUT please don't let them continue when there is a namespace already that hydrophone did not create explicitly in the current run.

@dims
Copy link
Member

dims commented Jan 27, 2024

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 27, 2024
@reetasingh
Copy link
Contributor Author

So, here's the thing, if the conformance namespace is already there then that means something that ran earlier will not be cleaned up. So we have to explicitly tell the user to clean the namespace by hand AND THEN re-run hydrophone.

If they don't and we try to reuse whatever exists, they will run into problems (tests may fail randomly etc)

So +1 to cleanup logs and be explicit about. BUT please don't let them continue when there is a namespace already that hydrophone did not create explicitly in the current run.

Thanks @dims for the context. in that case we can just throw error if a resource already exists.

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 27, 2024
@reetasingh reetasingh changed the title Handle failure case when a k8s resource already exists in the cluster Log Fatalf and ask user to run cleanup when a k8s resource already exists in the cluster Jan 27, 2024
@reetasingh reetasingh changed the title Log Fatalf and ask user to run cleanup when a k8s resource already exists in the cluster throw fatal error and ask user to run cleanup when a k8s resource already exists in the cluster Jan 27, 2024
@reetasingh reetasingh changed the title throw fatal error and ask user to run cleanup when a k8s resource already exists in the cluster Throw fatal error and ask user to run cleanup when a k8s resource already exists in the cluster Jan 27, 2024
@reetasingh
Copy link
Contributor Author

@dims @rjsadow I have fixed the PR to just throw fatal error if resource already exists

@rjsadow
Copy link
Collaborator

rjsadow commented Jan 28, 2024

/lgtm
/hold for @dims review

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 28, 2024
@dims
Copy link
Member

dims commented Jan 28, 2024

/approve
please feel free to cancel hold

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims, reetasingh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 28, 2024
@reetasingh
Copy link
Contributor Author

/hold remove

@reetasingh
Copy link
Contributor Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 29, 2024
@k8s-ci-robot k8s-ci-robot merged commit cd71d67 into kubernetes-sigs:main Jan 29, 2024
5 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants