ci also runs unit tests with -race enabled, but conditionally skips flaky tests #2705

zmt · 2022-01-29T07:14:27Z

Pull Request check list

Commit conforms to CONTRIBUTING.md
Proper tests/regressions included
Documentation updated

Affected functionality
Enable running the unit tests with the race detector in CI/CD for PRs and releases as an additional GitHub workflow.

Fixing racy tests is often difficult and time-consuming. Some tests interact badly with the additional runtime overhead of the race detector, and that can be impossible to resolve. To make it possible to always run the majority of tests under the race detector, this PR makes the following changes:

- add SkipFlakyTestUnderRaceConditionWithIssueFiled which does the following:
  - matches arg against SPIRE issue URL regexp or fails test
  - skips tests only when SKIP_FLAKY_TESTS_UNDER_RACE_DETECTOR is in the environment
- add ci-race-test target in Makefile to set SKIP_FLAKY_TESTS_UNDER_RACE_DETECTOR and -race
- add run_unit_tests_under_race_detector.sh and associated workflow configuration to run ci-race-test
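
For illustration, the skip utility described above could look roughly like the sketch below. This is not the code from this PR: the function names, the exact regexp, and the failure wording are assumptions made for the example. Only the environment variable name and the two behaviors (fail without an issue URL, skip only when the variable is set) come from the description.

```go
package main

import (
	"os"
	"regexp"
	"testing"
)

// skipEnvVar is the variable named in the PR description.
const skipEnvVar = "SKIP_FLAKY_TESTS_UNDER_RACE_DETECTOR"

// issueURLPattern is an assumed pattern for a SPIRE issue URL; the regexp
// actually used in the PR is not shown in this thread.
var issueURLPattern = regexp.MustCompile(`^https://github\.com/spiffe/spire/issues/\d+$`)

// looksLikeIssueURL reports whether ref has the shape of a SPIRE issue URL.
// (Hypothetical helper, split out so the check can be exercised on its own.)
func looksLikeIssueURL(ref string) bool {
	return issueURLPattern.MatchString(ref)
}

// skipFlakyTestUnderRace is a hypothetical stand-in for the utility added in
// this PR. It fails the test when issue does not look like a filed issue, and
// skips the test only when skipEnvVar is present in the environment.
func skipFlakyTestUnderRace(tb testing.TB, issue string) {
	tb.Helper()
	if !looksLikeIssueURL(issue) {
		// Placeholder wording, not the message from the PR.
		tb.Fatalf("%q does not look like a filed issue", issue)
	}
	if _, ok := os.LookupEnv(skipEnvVar); ok {
		tb.Skipf("skipping flaky test under race detector; see %s", issue)
	}
}
```

A test would call the helper first, for example `skipFlakyTestUnderRace(t, "https://github.com/spiffe/spire/issues/2840")`. Under a plain test run the variable is unset, so the test still executes; it is skipped only in the run that sets the variable alongside -race.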

Description of change
Add a GitHub workflow to run tests with the race detector on PRs and releases. Add a utility to skip specific tests during CI/CD runs with the race detector and to associate each skip with a GitHub issue.

Which issue this PR fixes
Fixes #2379.

azdagron · 2022-01-31T18:28:23Z

I'm a little hesitant to introduce a mechanism to skip racy tests (or any tests, for that matter):

- Skipping tests is something easily done and then forgotten. Skipped tests are worthless tests.
- CI/CD will ideally only be running the tests with -race, so skipped tests equate to zero coverage.
- With the ability to skip, the pressure to fix the racy test is severely diminished.

I think we'd get better results long term if we direct efforts at fixing the racy tests and enable -race everywhere.

Curious how the other maintainers feel about it though.

pkg/agent/plugin/nodeattestor/tpmdevid/devid_test.go

zmt · 2022-01-31T19:47:33Z

- reproducing flakiness of skipped tests to file detailed issues
- adding comments referencing filed issues for all tests skipped under race detector
- adding a comment in the SkipFlakyTest instructing to file issue and reference in comment

zmt · 2022-01-31T20:56:40Z

> I'm a little hesitant to introduce a mechanism to skip racy tests (or any tests, for that matter):
>
> Skipping tests is something easily done and then forgotten. Skipped tests are worthless tests.

100% agree in principle.

> CI/CD will ideally only be running the tests with -race, so skipped tests equate to zero coverage.

I don't agree with this assertion per se. If we run the full suite without -race and, in the run with -race enabled, conditionally skip only the tests known to be problematic under it, that's better than waiting until we can resolve difficult timing issues in test objects which may or may not represent actual production runtime problems. The proposal in this PR is to do just that:

- run full suite: make test
- run almost full suite (minus tests with known issues under -race): make race-test

> With the ability to skip, the pressure to fix the racy test is severely diminished.

If we don't run any unit tests in CI/CD with -race at all, we don't stand a chance of catching newly introduced problems. I am in the process of filing issues for each of the skips in the PR. To help mitigate this risk, I was thinking about adding a validator to the skip utility that requires an argument that looks like a filed issue and fails the test otherwise, but wasn't sure if that would be too draconian.

> I think we'd get better results long term if we direct efforts at fixing the racy tests and enable -race everywhere.

This might not be possible due to the differences between test binaries built with -race and those without.

> Curious how the other maintainers feel about it though.

azdagron

Thanks for this, @zmt. We discussed this in our maintainer sync and the consensus was that this was a fine short-term fix while we get all the racy tests sorted and that we could remove it once that was accomplished.

Just a few comments :)

test/util/race.go

.github/workflows/scripts/run_unit_tests.sh

rturner3 · 2022-02-17T20:00:10Z

Hey @zmt, just checking in since there hasn't been movement on this PR in the last couple weeks. Do you think you'll have time to address the comments in the next week or two? If not, I would request we close the PR for now and reopen when this is ready for review.

zmt · 2022-03-10T23:03:11Z

> Hey @zmt, just checking in since there hasn't been movement on this PR in the last couple weeks. Do you think you'll have time to address the comments in the next week or two? If not, I would request we close the PR for now and reopen when this is ready for review.

I didn't see this until I was already gearing up to start on it again. I thought leaving it in draft status would be OK. I've picked it back up now and hope to make quick work of it. I will be leaving it in draft mode until ready.

.github/workflows/pr_build.yaml

zmt · 2022-03-11T07:21:12Z

pkg/agent/plugin/nodeattestor/tpmdevid/devid_test.go

@@ -345,7 +345,7 @@ func TestConfigureWindows(t *testing.T) {
 }
 }

-func TestAidAttestationFailiures(t *testing.T) {


A little noise, but this test had been flaky when I started and the typo worked against me, so I decided to leave it a little better than I found it.

zmt · 2022-03-11T07:33:27Z

test/util/race.go

+ msg := "Skip only allowed with associated issue. "
+ msg += "%q does not appear to be an issue. "
+ msg += "File an issue and specify it to skip a test under race detector."


Perhaps this could use some additional wordsmithing...
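
To judge the wording, it may help to see the message as a reader would. The sketch below assumes the string is used as a fmt-style format with the offending argument substituted for %q; the function name is hypothetical and the call site in the PR is not shown in this thread.

```go
package main

import "fmt"

// skipWithoutIssueMessage renders the message from the diff above for a given
// argument. Hypothetical helper; it assumes the message is a fmt format string.
func skipWithoutIssueMessage(arg string) string {
	msg := "Skip only allowed with associated issue. "
	msg += "%q does not appear to be an issue. "
	msg += "File an issue and specify it to skip a test under race detector."
	return fmt.Sprintf(msg, arg)
}
```

For the argument `TODO`, this yields: Skip only allowed with associated issue. "TODO" does not appear to be an issue. File an issue and specify it to skip a test under race detector.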

azdagron

This is looking great. I like the enforcement that the skipped test be justified by providing an issue we track. That's a nice touch. Just one small comment on the name of the CI job and I think we're off to the races!

.github/workflows/pr_build.yaml

zmt · 2022-03-11T21:40:22Z

Apparently we have a flaky test even without the race detector; it failed on macOS :-(

azdagron

Thanks, @zmt!

zmt mentioned this pull request Jan 29, 2022

Set -race flag in CI unit test runs #2379

Closed

zmt force-pushed the issue/2379 branch from abb339a to 006d559 January 29, 2022 07:56

zmt marked this pull request as ready for review January 29, 2022 08:26

zmt requested review from amartinezfayo, azdagron, evan2645 and rturner3 as code owners January 29, 2022 08:26

zmt marked this pull request as draft January 31, 2022 19:45

zmt changed the title from "ci also runs unit tests with -race enabled, but skips flaky tests" to "ci also runs unit tests with -race enabled, but conditionally skips flaky tests" Jan 31, 2022

zmt force-pushed the issue/2379 branch from a7e1a4c to d4ab974 January 31, 2022 22:10

zmt force-pushed the issue/2379 branch 2 times, most recently from 90ceaea to 1f66324 January 31, 2022 23:03

azdagron self-assigned this Feb 3, 2022

azdagron reviewed Feb 4, 2022

test/util/race.go

test/util/race.go

.github/workflows/scripts/run_unit_tests.sh

.github/workflows/scripts/run_unit_tests.sh

evan2645 added this to the 1.2.1 milestone Feb 8, 2022

azdagron modified the milestones: 1.2.1, 1.2.2 Mar 3, 2022

zmt added 5 commits March 10, 2022 22:49

Instrument race-test target with SKIP_FLAKY_TESTS.

9609818

Signed-off-by: Zack Train <ztrain@uber.com>

Add make race-test to run_unit_tests.sh after make test.

26498d8

Signed-off-by: Zack Train <ztrain@uber.com>

Skip another flaky test and minor cleanup.

d96cb0c

Signed-off-by: Zack Train <ztrain@uber.com>

Skip TestGenerateKey under race detector also.

3859439

Signed-off-by: Zack Train <ztrain@uber.com>

Add issue refs and clarify intent.

d2babd6

Signed-off-by: Zack Train <ztrain@uber.com>

Add GOVERBOSE to make race-test in run_unit_tests.sh.

c5e3d6c

Signed-off-by: Zack Train <ztrain@uber.com>

zmt force-pushed the issue/2379 branch from 1f66324 to c5e3d6c March 10, 2022 22:52

fix erroneous assert import

aa181de

Signed-off-by: Zack Train <ztrain@uber.com>

zmt force-pushed the issue/2379 branch from 7a4ea89 to 724196c March 11, 2022 02:15

zmt added 3 commits March 11, 2022 02:19

run unit-test-race-detector as separate workflow

f09a7ed

Signed-off-by: Zack Train <ztrain@uber.com>

make it fatal to skip without an issue

851c266

Signed-off-by: Zack Train <ztrain@uber.com>

remove skips for fixed tests

ecff992

Signed-off-by: Zack Train <ztrain@uber.com>

zmt force-pushed the issue/2379 branch from 724196c to ecff992 March 11, 2022 02:20

zmt mentioned this pull request Mar 11, 2022

pkg/server/bundle/client.TestManagerOnDemandBundleRefresh flaky under race detector #2840

Closed

zmt added 3 commits March 11, 2022 03:41

skip TestManagerOnDemandBundleRefresh under race detector

bead7d0

Signed-off-by: Zack Train <ztrain@uber.com>

fix typo in Makefile

200032c

Signed-off-by: Zack Train <ztrain@uber.com>

skip TestAttestAgent under race detector

c66b5be

Signed-off-by: Zack Train <ztrain@uber.com>

zmt force-pushed the issue/2379 branch from 8c20d01 to c66b5be March 11, 2022 04:32

zmt added 2 commits March 11, 2022 07:00

do not run unit-test-race-detector on macos

a342226

Signed-off-by: Zack Train <ztrain@uber.com>

clean up matrix since only linux for race detector

9f2f1f6

Signed-off-by: Zack Train <ztrain@uber.com>

zmt force-pushed the issue/2379 branch from 44abb48 to 9f2f1f6 March 11, 2022 07:16

zmt commented Mar 11, 2022

.github/workflows/pr_build.yaml

zmt commented Mar 11, 2022

zmt marked this pull request as ready for review March 11, 2022 07:27

zmt commented Mar 11, 2022

azdagron reviewed Mar 11, 2022

.github/workflows/pr_build.yaml

Update .github/workflows/pr_build.yaml

9a2bf04

Co-authored-by: Andrew Harding <azdagron@gmail.com> Signed-off-by: Zack Train <ztrain@uber.com>

zmt force-pushed the issue/2379 branch from b2c068a to 9a2bf04 March 11, 2022 20:58

zmt added 2 commits March 14, 2022 10:28

Merge branch 'spiffe:main' into issue/2379

2bad36b

Merge branch 'spiffe:main' into issue/2379

872fa21

azdagron approved these changes Mar 15, 2022

Merge branch 'main' into issue/2379

65beb52

azdagron merged commit 1b972f0 into spiffe:main Mar 15, 2022
