Resolve race in db tests #9117

Merged
merged 4 commits into from
Nov 25, 2021

Conversation

tcsc
Contributor

@tcsc tcsc commented Nov 24, 2021

When disconnecting a client due to an expired certificate, the monitor
was emitting the disconnection event before forcing the client to
disconnect. This led to races in tests that wait on the disconnection
event, where tests would:

  1. wait for the disconnection event before proceeding
  2. receive the disconnection event
  3. take action based on the disconnection event, assuming that the
    client was disconnected, when there is no guarantee that the
    disconnection has actually happened yet.

This patch re-orders the disconnection and event broadcast, such that
the disconnection happens first, meaning that if a watcher receives
the disconnection event, then the disconnection has already been
attempted.
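
The re-ordering described above can be illustrated with a minimal sketch. It is not the code from this patch: the `conn` and `monitor` types, the `disconnectOnExpiry` method, and the `"client.disconnect"` event name are all hypothetical stand-ins, assumed only for illustration. The point it shows is that when the connection is closed before the event is sent, any watcher that receives the event can rely on the close having already been attempted.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for a client connection.
type conn struct {
	mu     sync.Mutex
	closed bool
}

// Close marks the connection as closed.
func (c *conn) Close() error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.closed = true
	return nil
}

// Closed reports whether Close has been called.
func (c *conn) Closed() bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.closed
}

// monitor is a hypothetical, heavily simplified connection monitor.
type monitor struct {
	conn   *conn
	events chan string
}

// disconnectOnExpiry closes the connection first and only then emits the
// disconnection event, mirroring the ordering this patch describes.
func (m *monitor) disconnectOnExpiry() {
	_ = m.conn.Close()
	m.events <- "client.disconnect" // assumed event name, for illustration
}

// observe triggers one expiry and reports whether the connection was
// already closed at the moment the watcher received the event.
func observe() bool {
	m := &monitor{conn: &conn{}, events: make(chan string, 1)}
	go m.disconnectOnExpiry()
	<-m.events
	return m.conn.Closed()
}

func main() {
	fmt.Println("closed when event observed:", observe())
}
```

With the previous ordering (event sent before `Close`), `observe` could return either `true` or `false` depending on scheduling, which is the kind of race the tests were hitting.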

@codingllama
Contributor

FYI, fixed a few typos in the description.

@tcsc tcsc enabled auto-merge (squash) November 24, 2021 23:06
@tcsc tcsc merged commit e8e4c8e into master Nov 25, 2021
@tcsc tcsc deleted the tcsc/db-disconnect-race branch November 25, 2021 01:10
tcsc added a commit that referenced this pull request Dec 1, 2021
When disconnecting a client due to an expired certificate, the monitor
was emitting the disconnection event before forcing the client to
disconnect. This led to races in tests that wait on the disconnection
event, where tests would:

    1. wait for the disconnection event before proceeding
    2. receive the disconnection event
    3. take action based on the disconnection event, assuming that the
        client was disconnected, when there is no guarantee that the
        disconnection has actually happened yet.

This patch re-orders the disconnection and event broadcast, such that
the disconnection happens first, meaning that if a watcher receives
the disconnection event, then the disconnection has already been
attempted.
tcsc added a commit that referenced this pull request Dec 1, 2021
tcsc added a commit that referenced this pull request Dec 1, 2021
Part of this change is implementing a "no secrets" policy for CI. Given that

  1.  we have to support CI for arbitrary external contributors, and
  2.  it is easy to craft a malicious PR that exfiltrates secrets during a CI build

any test that runs under CI must be able to do so without any injected secrets.

This means that several of the tests we currently run under Drone will not be
run on GCB, at least as part of the regular CI. The plan is to create a separate
task that periodically runs tests that require external credentials (e.g. Kube tests,
various backend data stores, etc.) in a more secure way and reports failures
asynchronously. And while these tests will not run under CI, they should still be
built under CI so that required changes are caught during review.

Note: this backport includes various data race fixes added separately in the master branch:

See-Also: #8643
See-Also: #8888
See-Also: #9117
See-Also: #9119
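
One way to read the "no secrets" policy above is that credential-dependent tests are always compiled but skip themselves when nothing has been injected. The sketch below assumes that approach; it is not taken from the change itself. The environment variable name `EXAMPLE_BACKEND_CREDENTIALS` and the `shouldRun` helper are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// credsEnvVar is a hypothetical variable name, used here for illustration only.
const credsEnvVar = "EXAMPLE_BACKEND_CREDENTIALS"

// shouldRun reports whether a credential-dependent test should run.
// The lookup function is injected so the decision can be checked
// without touching the real process environment.
func shouldRun(lookup func(string) (string, bool)) bool {
	v, ok := lookup(credsEnvVar)
	return ok && v != ""
}

func main() {
	if !shouldRun(os.LookupEnv) {
		// The code is still built; it just does not exercise the backend.
		fmt.Println("skipping: no credentials injected")
		return
	}
	fmt.Println("running credential-dependent test")
}
```

Inside a `_test.go` file, the same check would typically call `t.Skip` from the standard `testing` package instead of printing, so the test still builds under CI and breakages in its code are caught during review.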
tcsc added a commit that referenced this pull request Dec 16, 2021
Part of this change is implementing a "no secrets" policy for CI. Given that

  1.  we have to support CI for arbitrary external contributors, and
  2.  it is easy to craft a malicious PR that exfiltrates secrets during a CI build

any test that runs under CI must be able to do so without any injected secrets.

This means that several of the tests we currently run under Drone will not be run
on GCB, at least as part of the regular CI. The plan is to create a separate task
that periodically runs tests that require external credentials (e.g. Kube tests,
various backend data stores, etc.) in a more secure way and reports failures
asynchronously. And while these tests will not run under CI, they should still be
built under CI so that required changes are caught during review.

See-Also: #8608
See-Also: #8643
See-Also: #8888
See-Also: #9117
See-Also: #9119
@russjones russjones mentioned this pull request Dec 19, 2021
4 participants