Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deflake TestAgentForward #13166

Merged
merged 4 commits into from
Jun 9, 2022
Merged

Deflake TestAgentForward #13166

merged 4 commits into from
Jun 9, 2022

Conversation

zmb3
Copy link
Collaborator

@zmb3 zmb3 commented Jun 3, 2022

This was another case of the t.TempDir() being cleaned up while the
audit logger is still writing to the directory, which happens when
tests don't properly clean up after themselves. Ensure that any
services we spin up closed via cleanup actions.

Updates #9492
Fixes #5333

@github-actions github-actions bot requested review from Joerger and smallinsky June 3, 2022 22:11
@zmb3
Copy link
Collaborator Author

zmb3 commented Jun 3, 2022

Looks like this still doesn't quite solve it, I'm still seeing some flakiness.

lib/srv/regular/sshserver_test.go Outdated Show resolved Hide resolved
@zmb3 zmb3 force-pushed the zmb3/deflake-test-agent-forward branch from 447dff0 to 4390ecb Compare June 6, 2022 21:30
@zmb3
Copy link
Collaborator Author

zmb3 commented Jun 6, 2022

@Joerger and @r0mant I think I've fixed this for good now. See latest commit.

@zmb3 zmb3 force-pushed the zmb3/deflake-test-agent-forward branch from 4390ecb to 7eb858c Compare June 6, 2022 21:32
@zmb3 zmb3 changed the title Attempt to deflake TestAgentForward Deflake TestAgentForward Jun 6, 2022
This was another case of the t.TempDir() being cleaned up while the
audit logger is still writing to the directory, which happens when
tests don't properly clean up after themselves. Ensure that any
services we spin up closed via cleanup actions.
@zmb3 zmb3 force-pushed the zmb3/deflake-test-agent-forward branch 2 times, most recently from 53b0a17 to 0b8a770 Compare June 7, 2022 00:37
@zmb3 zmb3 marked this pull request as draft June 7, 2022 02:58
@zmb3 zmb3 force-pushed the zmb3/deflake-test-agent-forward branch from 0b8a770 to 9ae3642 Compare June 9, 2022 14:28
@zmb3 zmb3 marked this pull request as ready for review June 9, 2022 14:28
@zmb3 zmb3 requested review from Joerger and r0mant June 9, 2022 14:28
@github-actions github-actions bot added the audit-log Issues related to Teleports Audit Log label Jun 9, 2022
@github-actions github-actions bot requested a review from xacrimon June 9, 2022 14:29
@zmb3
Copy link
Collaborator Author

zmb3 commented Jun 9, 2022

This fixes the flakiness, but I noticed a separate issue which I've filed here: #13335

The disk-based logger runs a background process to complete uploads,
which occaisionally fails to finish before the test cleanup tries
to remove the temporary directory.

There are two ways to prevent the use of a disk based logger:

1. Set IsTestStub on the SSH *ServerContext
2. Use a sync session recording mode

Option 2 was selected, because the ServerContext is created by the SSH
server instead of the test, so plumbing that value through would be
a larger change, and I generally dislike test specific modes that can
be mistakenly enabled in non-test situations.

Additionally, update the lib/srv/regular test fixture to allow for
configuring the audit log to use. This allows us to set up a dicarding
logger, since these tests are about agent forwarding behavior and not
audit logging.
@zmb3 zmb3 force-pushed the zmb3/deflake-test-agent-forward branch from 29f0a6e to f9c7353 Compare June 9, 2022 15:43
@zmb3 zmb3 merged commit bd78561 into master Jun 9, 2022
@github-actions
Copy link

github-actions bot commented Jun 9, 2022

@zmb3 See the table below for backport results.

Branch Result
branch/v8 Failed
branch/v9 Failed

zmb3 added a commit that referenced this pull request Jun 9, 2022
* Attempt to deflake TestAgentForward

This was another case of the t.TempDir() being cleaned up while the
audit logger is still writing to the directory, which happens when
tests don't properly clean up after themselves. Ensure that any
services we spin up closed via cleanup actions.

* Prevent the use of disk-based logging in TestAgentForward

The disk-based logger runs a background process to complete uploads,
which occaisionally fails to finish before the test cleanup tries
to remove the temporary directory.

There are two ways to prevent the use of a disk based logger:

1. Set IsTestStub on the SSH *ServerContext
2. Use a sync session recording mode

Option 2 was selected, because the ServerContext is created by the SSH
server instead of the test, so plumbing that value through would be
a larger change, and I generally dislike test specific modes that can
be mistakenly enabled in non-test situations.

Additionally, update the lib/srv/regular test fixture to allow for
configuring the audit log to use. This allows us to set up a dicarding
logger, since these tests are about agent forwarding behavior and not
audit logging.
zmb3 added a commit that referenced this pull request Jun 9, 2022
* Attempt to deflake TestAgentForward

This was another case of the t.TempDir() being cleaned up while the
audit logger is still writing to the directory, which happens when
tests don't properly clean up after themselves. Ensure that any
services we spin up closed via cleanup actions.

* Prevent the use of disk-based logging in TestAgentForward

The disk-based logger runs a background process to complete uploads,
which occaisionally fails to finish before the test cleanup tries
to remove the temporary directory.

There are two ways to prevent the use of a disk based logger:

1. Set IsTestStub on the SSH *ServerContext
2. Use a sync session recording mode

Option 2 was selected, because the ServerContext is created by the SSH
server instead of the test, so plumbing that value through would be
a larger change, and I generally dislike test specific modes that can
be mistakenly enabled in non-test situations.

Additionally, update the lib/srv/regular test fixture to allow for
configuring the audit log to use. This allows us to set up a dicarding
logger, since these tests are about agent forwarding behavior and not
audit logging.
zmb3 added a commit that referenced this pull request Jun 10, 2022
* Deflake TestAgentForward (#13166)

* Attempt to deflake TestAgentForward

This was another case of the t.TempDir() being cleaned up while the
audit logger is still writing to the directory, which happens when
tests don't properly clean up after themselves. Ensure that any
services we spin up closed via cleanup actions.

* Prevent the use of disk-based logging in TestAgentForward

The disk-based logger runs a background process to complete uploads,
which occaisionally fails to finish before the test cleanup tries
to remove the temporary directory.

There are two ways to prevent the use of a disk based logger:

1. Set IsTestStub on the SSH *ServerContext
2. Use a sync session recording mode

Option 2 was selected, because the ServerContext is created by the SSH
server instead of the test, so plumbing that value through would be
a larger change, and I generally dislike test specific modes that can
be mistakenly enabled in non-test situations.

Additionally, update the lib/srv/regular test fixture to allow for
configuring the audit log to use. This allows us to set up a dicarding
logger, since these tests are about agent forwarding behavior and not
audit logging.

* Fix failures

Some tests are failing due to us now setting a clock on our auditlog.
Get closer to the original behavior on this branch by:

- not setting the clock
- not using a checking emitter

Co-authored-by: Zac Bergquist <zmb3@users.noreply.github.com>
@zmb3 zmb3 deleted the zmb3/deflake-test-agent-forward branch June 10, 2022 22:46
zmb3 added a commit that referenced this pull request Jun 13, 2022
* Attempt to deflake TestAgentForward

This was another case of the t.TempDir() being cleaned up while the
audit logger is still writing to the directory, which happens when
tests don't properly clean up after themselves. Ensure that any
services we spin up closed via cleanup actions.

* Prevent the use of disk-based logging in TestAgentForward

The disk-based logger runs a background process to complete uploads,
which occaisionally fails to finish before the test cleanup tries
to remove the temporary directory.

There are two ways to prevent the use of a disk based logger:

1. Set IsTestStub on the SSH *ServerContext
2. Use a sync session recording mode

Option 2 was selected, because the ServerContext is created by the SSH
server instead of the test, so plumbing that value through would be
a larger change, and I generally dislike test specific modes that can
be mistakenly enabled in non-test situations.

Additionally, update the lib/srv/regular test fixture to allow for
configuring the audit log to use. This allows us to set up a dicarding
logger, since these tests are about agent forwarding behavior and not
audit logging.
zmb3 added a commit that referenced this pull request Jun 13, 2022
* Attempt to deflake TestAgentForward

This was another case of the t.TempDir() being cleaned up while the
audit logger is still writing to the directory, which happens when
tests don't properly clean up after themselves. Ensure that any
services we spin up closed via cleanup actions.

* Prevent the use of disk-based logging in TestAgentForward

The disk-based logger runs a background process to complete uploads,
which occaisionally fails to finish before the test cleanup tries
to remove the temporary directory.

There are two ways to prevent the use of a disk based logger:

1. Set IsTestStub on the SSH *ServerContext
2. Use a sync session recording mode

Option 2 was selected, because the ServerContext is created by the SSH
server instead of the test, so plumbing that value through would be
a larger change, and I generally dislike test specific modes that can
be mistakenly enabled in non-test situations.

Additionally, update the lib/srv/regular test fixture to allow for
configuring the audit log to use. This allows us to set up a dicarding
logger, since these tests are about agent forwarding behavior and not
audit logging.

Co-authored-by: Zac Bergquist <zmb3@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
audit-log Issues related to Teleports Audit Log flaky tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Test flakes: lib/srv/regular, time-sensitive assertion
4 participants