Fix context issue during cleanup of kind clusters #6771
ci/kind/kind-setup.sh
Outdated
done
done
) 200>>"$LOCK_FILE"
rm -rf $LOCK_FILE
This is not a good idea, IMO. It looks like there could be a race condition where we delete the file while another job is still holding the lock.
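The concern can be reproduced in a few lines. This is an illustrative sketch only: `second_lock_after_rm` and the descriptor numbers 8 and 9 are made up for the example, and it assumes Linux with the util-linux `flock` utility. Because a flock lock is tied to the open file rather than to its path, removing the lock file lets a later job create a fresh file and lock it while the first job still holds its own lock.

```shell
# Hypothetical helper written for this illustration; not part of kind-setup.sh.
second_lock_after_rm() {
  lock="$1"
  exec 8>>"$lock"          # "job A" opens the lock file...
  flock -x 8               # ...and takes an exclusive lock on it
  rm -f "$lock"            # cleanup removes the path while A still holds the lock
  exec 9>>"$lock"          # "job B" recreates the path, which is now a new file
  if flock -n -x 9; then   # non-blocking attempt on the new file
    echo "both jobs hold the lock"
  else
    echo "second lock refused"
  fi
  exec 8>&- 9>&-           # close both descriptors, releasing the locks
}

second_lock_after_rm "$(mktemp -d)/demo.lock"
```

If the `rm -f` line is dropped, both descriptors refer to the same file and the second, non-blocking attempt is expected to be refused, which is the behaviour a lock file is supposed to provide.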
But until we delete config.lock, the other process will not be able to acquire the lock, and it may panic.
Also, since flock is not compatible with the Go-based locking mechanism, what is your opinion: should we use the alternative approach of writing the cluster names to a file and acquiring a lock over that file, or should we go ahead with acquiring a lock over the kubeconfig and use the approach below?
Before invoking the kind create cluster command, we can use flock to wait for the other process to release the lock, and only then trigger the command. That way flock will not interfere with the Go-based locking mechanism; the only cost is a couple of otherwise unnecessary flock calls in the code.
@antoninbas what's your opinion on this?
We should not use ~/.kube/config.lock; I think that's a given. We can pretend it doesn't exist.
So we need our own lock file. After that, we have 2 options, and you can choose which one you want to use:
- have our own state file to keep track of cluster names and creation timestamps (which I was originally proposing)
- only rely on kubectl / kind, and do not introduce our own state file. With this option we use flock to acquire a lock before calling kubectl / kind, as appropriate
I think both approaches will work
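As a rough sketch of the first option (a state file guarded by our own lock file), something like the following could work. All names here are hypothetical and not taken from the PR: `STATE_DIR`, `clusters.state`, `clusters.lock`, and the helper functions. The state file format (one `name timestamp` pair per line) is also an assumption, as is the use of file descriptor 9 and the util-linux `flock` utility.

```shell
# Hypothetical location; functions read STATE_DIR at call time so it can be overridden.
STATE_DIR="${STATE_DIR:-${TMPDIR:-/tmp}/kind-state-demo}"

# Run a command while holding an exclusive lock on our own lock file.
with_state_lock() {
  mkdir -p "$STATE_DIR"
  (
    flock -x 9 || exit 1
    "$@"
  ) 9>>"$STATE_DIR/clusters.lock"
}

# Internal helpers; they assume the caller already holds the lock.
_append() { printf '%s %s\n' "$1" "$2" >>"$STATE_DIR/clusters.state"; }
_drop() {
  touch "$STATE_DIR/clusters.state"
  awk -v n="$1" '$1 != n' "$STATE_DIR/clusters.state" >"$STATE_DIR/clusters.state.tmp"
  mv "$STATE_DIR/clusters.state.tmp" "$STATE_DIR/clusters.state"
}
_stale() {
  touch "$STATE_DIR/clusters.state"
  awk -v now="$(date +%s)" -v max="$1" 'now - $2 > max { print $1 }' "$STATE_DIR/clusters.state"
}

# Record a cluster name with its creation time (epoch seconds, defaults to now).
record_cluster() { with_state_lock _append "$1" "${2:-$(date +%s)}"; }
# Remove a cluster from the state file once it has been deleted.
forget_cluster() { with_state_lock _drop "$1"; }
# Print the names of clusters created more than $1 seconds ago.
stale_clusters() { with_state_lock _stale "$1"; }
```

A cleanup job could then iterate over `stale_clusters 3600`, delete each cluster, and call `forget_cluster` for it. Note that the lock file itself is never removed, which avoids the race raised earlier in the thread.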
So for the 2nd approach we again rely on ~/.kube/config.lock, right? I think we can go ahead with this approach.
No, we don't rely on ~/.kube/config.lock for either approach; we have already discussed why it is not a viable option.
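To make the second option concrete, here is a minimal sketch of wrapping kind / kubectl invocations in flock on a dedicated lock file. `KIND_LOCK_FILE`, its default path, `run_locked`, the 300-second timeout, and file descriptor 9 are all assumptions for illustration; the only point is that the lock file is our own and has nothing to do with ~/.kube/config.lock.

```shell
# Hypothetical lock file path, deliberately distinct from ~/.kube/config.lock.
KIND_LOCK_FILE="${KIND_LOCK_FILE:-${TMPDIR:-/tmp}/kind-setup-demo.lock}"

# Run the given command while holding an exclusive lock on KIND_LOCK_FILE.
run_locked() {
  (
    # Wait up to 300s for other jobs to release the lock, then run the command.
    flock -x -w 300 9 || { echo "timed out waiting for $KIND_LOCK_FILE" >&2; exit 1; }
    "$@"
  ) 9>>"$KIND_LOCK_FILE"
}

# Intended use (not executed here):
#   run_locked kind create cluster --name "$CLUSTER_NAME"
#   run_locked kind delete cluster --name "$CLUSTER_NAME"
```

The trade-off of this shape is that jobs are serialized for the full duration of each wrapped command; in exchange, no separate state file has to be kept consistent.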
Ideally we have it on all our testbeds, so this should not be an issue. @XinShuYang and @KMAnju-2021 can confirm.
Yes, iirc we have this /var/lib/jenkins directory on all kind testbeds.
Actually it's good that you highlight this. /var/lib/jenkins/ is not an appropriate directory to use here, as ci/kind/kind-setup.sh is used for local development, GitHub CI, etc. Please use a more "universal" directory, such as ~/.kube/antrea/ (you can create the directory if it doesn't exist). I guess ~/.antrea/ would also be a good choice if we don't want to write anything to the ~/.kube directory.
I guess ~/.antrea would be a better option here
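For reference, creating the directory on demand could look like the sketch below. `ensure_state_dir` and the `ANTREA_STATE_DIR` override are hypothetical names introduced here; only the `~/.antrea` default comes from the discussion.

```shell
# Print the directory used for lock/state files, creating it if it doesn't exist.
# ANTREA_STATE_DIR is a hypothetical override; the default follows the thread.
ensure_state_dir() {
  dir="${ANTREA_STATE_DIR:-$HOME/.antrea}"
  mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Intended use (not executed here):
#   LOCK_FILE="$(ensure_state_dir)/some-lock-file"
```

Using `mkdir -p` keeps the call idempotent, so the script behaves the same on a fresh developer machine, in GitHub CI, and on a long-lived testbed.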
@antoninbas can you have a look at the code changes, thanks.
Signed-off-by: Pulkit Jain <pulkit.jain@broadcom.com>
Fix context issue during cleanup of kind clusters.
Fixes #6768.