bazel: tweak logic for staging test artifacts in bazci #63767

Merged
merged 1 commit into from
Apr 23, 2021

Conversation

rickystewart
Collaborator

Because Bazel aggressively caches build/test artifacts, if we're not
careful, bazci can copy OLD artifacts from previous build/test runs into
the artifacts directory. This is particularly an issue because TeamCity
watches that directory for test.xml reports, and if an old test.xml
reports a test failure, TeamCity will notice that and report the failure
in the UI -- and importantly, even if we replace that test.xml with a
completely different one that reports that the test succeeded, TC will
not amend what's displayed in the UI accordingly. So this can manifest
as reported test failures from unrelated PRs showing up in TC in an
apparently unpredictable (though uncommon) manner.

We fix this by making bazci a little smarter about when we choose to
stage artifacts:

  1. The first time the watcher loops over all the test artifacts, never
    stage anything (the artifacts are probably cached -- not enough time
    has passed for any legitimate artifacts to appear).
  2. Only stage artifacts incrementally if their stats have changed since
    the initial round of caching.
  3. During the final loop, stage ALL artifacts (if they haven't been
    staged yet), just to make sure we don't miss anything.
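
The three rules above can be sketched roughly as follows. This is not the code from `watch.go`; the names (`stager`, `fileStat`, the `phase` constants) are hypothetical, and two behaviors are assumptions on my part: a file that first appears after the initial pass counts as "changed", and a file is staged at most once.

```go
package main

import "fmt"

// fileStat is a hypothetical stand-in for whichever file stats get compared.
type fileStat struct {
	size    int64
	modTime int64 // Unix nanoseconds
}

type phase int

const (
	initialCachingPhase phase = iota
	incrementalUpdatePhase
	finalizePhase
)

// stager remembers the stats seen on the first pass and what has been staged.
type stager struct {
	cached map[string]fileStat
	staged map[string]struct{}
}

func newStager() *stager {
	return &stager{cached: map[string]fileStat{}, staged: map[string]struct{}{}}
}

// shouldStage reports whether path should be copied to the artifacts
// directory in the given phase, and records the decision.
func (s *stager) shouldStage(p phase, path string, st fileStat) bool {
	if _, done := s.staged[path]; done {
		return false // assumption: each file is staged at most once
	}
	switch p {
	case initialCachingPhase:
		// Rule 1: only record the stats; never stage on the first pass.
		s.cached[path] = st
		return false
	case incrementalUpdatePhase:
		// Rule 2: skip files whose stats match the first pass.
		if old, ok := s.cached[path]; ok && old == st {
			return false
		}
	case finalizePhase:
		// Rule 3: stage everything that has not been staged yet.
	}
	s.staged[path] = struct{}{}
	return true
}

func main() {
	s := newStager()
	stale := fileStat{size: 10, modTime: 1}
	fresh := fileStat{size: 12, modTime: 2}
	fmt.Println(s.shouldStage(initialCachingPhase, "a/test.xml", stale))    // false
	fmt.Println(s.shouldStage(incrementalUpdatePhase, "a/test.xml", stale)) // false
	fmt.Println(s.shouldStage(incrementalUpdatePhase, "a/test.xml", fresh)) // true
	fmt.Println(s.shouldStage(finalizePhase, "b/test.xml", stale))          // true
}
```

Note that rule 3 means a cached `test.xml` that never changes is still staged at the very end, so this sketch trades a small window of staleness for not missing artifacts.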

Also add lots of comments to make these design decisions and their
motivations a little clearer.

Resolves #63740.

Release note: None

@rickystewart rickystewart requested a review from rail April 15, 2021 22:21
@cockroach-teamcity
Member

This change is Reviewable

@rickystewart rickystewart removed the request for review from rail April 15, 2021 22:43
@rickystewart
Collaborator Author

Ah, I think this isn't behaving exactly how I wanted it to, recalling for now :)

@rickystewart rickystewart added the do-not-merge label (bors won't merge a PR with this label) Apr 15, 2021
@rickystewart rickystewart requested a review from rail April 20, 2021 21:29
@rickystewart rickystewart removed the do-not-merge label (bors won't merge a PR with this label) Apr 20, 2021
@rickystewart
Collaborator Author

days later, i realized it's because go switch statements don't work like they do in C 🙃
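
For anyone hitting the same surprise: the exact bug isn't shown in this thread, but the language difference itself can be illustrated with a small sketch. In Go each `case` ends with an implicit break, so an empty `case` does nothing rather than sharing the next case's body as it would in C; falling through has to be requested explicitly with `fallthrough`. The function names below are made up for the illustration.

```go
package main

import "fmt"

// withoutFallthrough uses Go's default behavior: only the matching case runs.
func withoutFallthrough(n int) []string {
	var hit []string
	switch n {
	case 0:
		hit = append(hit, "zero")
	case 1:
		hit = append(hit, "one")
	default:
		hit = append(hit, "default")
	}
	return hit
}

// withFallthrough opts in to C-like behavior for case 0 only.
func withFallthrough(n int) []string {
	var hit []string
	switch n {
	case 0:
		hit = append(hit, "zero")
		fallthrough // continue into the body of case 1
	case 1:
		hit = append(hit, "one")
	default:
		hit = append(hit, "default")
	}
	return hit
}

func main() {
	fmt.Println(withoutFallthrough(0)) // [zero]
	fmt.Println(withFallthrough(0))    // [zero one]
}
```

To share one body between several values, Go's idiom is a comma-separated list such as `case 0, 1:` rather than stacked empty cases.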

Member

@rail rail left a comment

Overall it looks good to me. There are a couple of small changes.

Reviewed 1 of 1 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rickystewart)


pkg/cmd/bazci/watch.go, line 73 at r1 (raw file):

	fileToStaged map[string]bool

I think you can use empty structs here - they are more memory-friendly than bool. https://dave.cheney.net/2014/03/25/the-empty-struct is a good read about those.


pkg/cmd/bazci/watch.go, line 253 at r1 (raw file):

}

var _ io.WriteCloser = (*cancelableWriter)(nil)

Did you add this to make sure we implement the interface? Can you add a short comment please.

Collaborator Author

@rickystewart rickystewart left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rail)


pkg/cmd/bazci/watch.go, line 253 at r1 (raw file):

Previously, rail (Rail Aliiev) wrote…

Did you add this to make sure we implement the interface? Can you add a short comment please.

Done.

@rickystewart rickystewart requested a review from rail April 23, 2021 16:41
Member

@rail rail left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rail)

@rickystewart
Collaborator Author

bors r=rail

@craig
Contributor

craig bot commented Apr 23, 2021

Build failed (retrying...):

@craig
Contributor

craig bot commented Apr 23, 2021

Build succeeded:

@craig craig bot merged commit fe96523 into cockroachdb:master Apr 23, 2021
Development

Successfully merging this pull request may close these issues.

bazel: TC can report failures to tests on previously-run builds on the same agent
3 participants