🐛 [relase-0.8] Fix a race condition between leader election and recorder #1381

vincepri · 2021-02-10T19:40:43Z

This change introduces better synchronization between the leader election
code and the event recorder. When running tests with the -race flag, we
often saw a panic on a closed channel; the channel was the one that the
event recorder uses internally.

After digging further into the code, it seems that we weren't waiting
for the leader election code to stop completely; we were only calling
the cancel() function to ask leader election to stop.
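
To illustrate the failure mode, here is a minimal sketch, not the actual controller-runtime or client-go code. The `recorder` type and its `emit` and `stop` methods are hypothetical stand-ins. It forces the bad ordering deterministically: the recorder's channel is closed while a caller, standing in for leader election, still emits an event.

```go
package main

import "fmt"

// recorder is a hypothetical stand-in for an event recorder that
// delivers events over an internal channel.
type recorder struct {
	events chan string
}

// emit sends an event; sending on a closed channel panics in Go.
func (r *recorder) emit(msg string) {
	r.events <- msg
}

// stop closes the internal channel, as a recorder shutdown would.
func (r *recorder) stop() {
	close(r.events)
}

// emitAfterStop stops the recorder first and then emits, which is the
// ordering that can occur when nothing waits for leader election to
// finish. It returns the recovered panic message.
func emitAfterStop() (msg string) {
	defer func() {
		if p := recover(); p != nil {
			msg = fmt.Sprint(p)
		}
	}()
	r := &recorder{events: make(chan string, 1)}
	r.stop()
	r.emit("stopped leading")
	return ""
}

func main() {
	fmt.Println(emitAfterStop())
}
```

In the real code the two steps run concurrently, so the panic shows up only intermittently, which matches it being seen "often" rather than always.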

With this change, during a shutdown, we now wait for leader election to
finish any internal tasks, after which an internal channel is closed.
Only after that channel has been closed, signaling that Run(...) has
properly returned, do we return execution to the stop procedure, where
the event recorder is then stopped.

Backport of #1379 to the release-0.8 branch.

Signed-off-by: Vince Prignano <vincepri@vmware.com>

vincepri · 2021-02-10T19:41:04Z

/milestone v0.8.x

vincepri · 2021-02-10T19:41:14Z

/assign @alvaroaleman @christopherhein

alvaroaleman

/lgtm

k8s-ci-robot · 2021-02-10T20:03:07Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman, vincepri

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [alvaroaleman,vincepri]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

alvaroaleman · 2021-02-10T20:03:23Z

We have no tests on this branch :/

vincepri · 2021-02-10T20:04:18Z

🤔 Seems we need to fix that in test infra

vincepri added 2 commits February 10, 2021 11:38

Only cancel leader election if the runnables have shutdown

faecdc7

Signed-off-by: Vince Prignano <vincepri@vmware.com>

k8s-ci-robot added the "cncf-cla: yes" label (indicates the PR's author has signed the CNCF CLA) Feb 10, 2021

k8s-ci-robot requested review from alenkacz and droot February 10, 2021 19:40

k8s-ci-robot added the "approved" (indicates a PR has been approved by an approver from all required OWNERS files) and "size/S" (denotes a PR that changes 10-29 lines, ignoring generated files) labels Feb 10, 2021

k8s-ci-robot added this to the v0.8.x milestone Feb 10, 2021

k8s-ci-robot assigned alvaroaleman and christopherhein Feb 10, 2021

alvaroaleman changed the title ~~🐛 Fix a race condition between leader election and recorder~~ 🐛 [relase-0.8] Fix a race condition between leader election and recorder Feb 10, 2021

alvaroaleman approved these changes Feb 10, 2021

k8s-ci-robot added the "lgtm" label ("Looks good to me", indicates that a PR is ready to be merged) Feb 10, 2021

k8s-ci-robot merged commit a8c19c4 into kubernetes-sigs:release-0.8 Feb 10, 2021

4 participants