make flaky test less sensitive #19976
Conversation
This looks great to me, but I'm pretty short on context for this part of the codebase, so I'm shying away from approval. Michael and I had a DM conversation about this. I think my team can do more to set expectations around responsibility; I'll have a clearer DRI and a better process going forward. Thanks for helping us with the first run.
```java
final long numAttempts = apiClient.getJobsApi()
    .getJobInfo(new JobIdRequestBody().id(connectionSyncRead.getJob().getId()))
    .getAttempts()
    .size();

// it should be able to accomplish the resume without an additional attempt!
assertEquals(1, numAttempts);
```
I think the reason this assertion was being made was that we explicitly wanted to test that if the worker is scaled down and back up in the middle of a sync (e.g. deploying a new change to cloud prod), then those syncs will NOT be failed but will instead continue on as normal and succeed.
If we get rid of this assertion, a regression where scaling down a worker in the middle of a sync actually causes that sync to fail could go unnoticed.
@mfsiega-airbyte Do we have a sense of why these sync attempts are failing in CI? I feel like we should be able to rely on the fact that our acceptance test sync attempts are succeeding, otherwise other tests will likely be breaking as well.
Here's my understanding - bear in mind that I don't actually have any context on this test, beyond looking at it today.
This test (even after this PR) does validate that the sync succeeds even if the worker is scaled down and back up. Note that on the preceding line we call `waitForSuccessfulJob`; if the sync fails, `waitForSuccessfulJob` will eventually throw and the test will fail.
What the assertion deleted by this PR was verifying is that the sync succeeds within a single attempt. Is there a reason it's important that it succeeds within a single attempt? And in the other places where our acceptance tests look for successful syncs, do they also verify success within a single attempt?
If not, it's hard to see why we're verifying that here (except I guess that it's a cool implementation detail).
That's a good point about our other acceptance tests.
I think this was trying to ensure that scaling down and up doesn't actually result in any "failures" at all, whether at the attempt or the overall sync level (i.e. it should be invisible to the user that the scaling down/up happened at all). But since our other tests are not ensuring this, it probably doesn't make sense to do that here. And as long as the overall sync succeeds, that's the more important part.
Given that, I think I'm fine with this change. But it might be worthwhile to separately look into why these attempts are failing, since I think that is still unexpected.
Agree on all of it. I see what you mean about scaling being invisible.
I'll go ahead and merge this but agree we should understand why some attempts are failing, and ideally we'd be able to re-enable these.
What
Make flaky test less sensitive.
How
Don't insist that the sync complete in a single attempt. If the sync needs a retry for any other reason, the old assertion fails the test. This is too sensitive, and the flakiness is interfering with the normal dev process.
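To illustrate the difference between the two checks, here is a minimal, self-contained sketch. The `Sync` class below is hypothetical and only models the platform's retry behavior; it is not Airbyte code. It shows why asserting final success (which the PR keeps) is robust to retries, while asserting `numAttempts == 1` (which the PR deletes) fails whenever any transient error triggers a retry:

```java
import java.util.ArrayList;
import java.util.List;

public class FlakyAssertionSketch {
    // Hypothetical toy "sync" that fails a configurable number of times
    // before succeeding, standing in for a job whose attempts may be retried.
    static final class Sync {
        private final int failuresBeforeSuccess;
        private final List<String> attempts = new ArrayList<>();

        Sync(int failuresBeforeSuccess) {
            this.failuresBeforeSuccess = failuresBeforeSuccess;
        }

        // Runs attempts until one succeeds, mirroring a retry loop.
        boolean waitForSuccessfulJob() {
            while (true) {
                boolean succeeded = attempts.size() >= failuresBeforeSuccess;
                attempts.add(succeeded ? "succeeded" : "failed");
                if (succeeded) {
                    return true;
                }
            }
        }

        int numAttempts() {
            return attempts.size();
        }
    }

    public static void main(String[] args) {
        // A sync that needs one retry, e.g. due to an unrelated transient failure.
        Sync sync = new Sync(1);

        // The relaxed check the PR keeps: the job eventually succeeds.
        boolean succeeded = sync.waitForSuccessfulJob();
        System.out.println("succeeded=" + succeeded);

        // The deleted strict check would have been:
        //   assertEquals(1, sync.numAttempts());
        // which fails here (2 attempts) even though the sync succeeded.
        System.out.println("numAttempts=" + sync.numAttempts());
    }
}
```

The sketch prints `succeeded=true` and `numAttempts=2`: the sync still succeeds, but the single-attempt assertion would have failed the test.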