
test: mark test-fs-rmdir-recursive flaky on win #41533

Closed

Conversation

mhdawson
Member

Refs: #41201

From recent reliability reports this is now the most
common failure by far in CI runs. Mark the test as
flaky until the issue is resolved.

Signed-off-by: Michael Dawson <mdawson@devrus.com>
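
For context, marking a test flaky in Node.js CI is done through the test status files; the entry this PR adds likely looks something like the sketch below (the exact placement and surrounding entries in test/parallel/parallel.status are an assumption here):

```
# test/parallel/parallel.status (sketch; surrounding entries omitted)
[$system==win32]
# https://github.com/nodejs/node/issues/41201
test-fs-rmdir-recursive: PASS,FLAKY
```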

@nodejs-github-bot added the needs-ci (PRs that need a full CI run) and test (Issues and PRs related to the tests) labels on Jan 14, 2022
@lpinca
Member

lpinca commented Jan 15, 2022

The issue might be fixed by #41545 so I would wait before landing this.

@mhdawson
Member Author

@lpinca thanks, landed #41545. Will see whether the problem still recurs before landing this.

@mhdawson
Member Author

@bcoe any chance you could have a quick look at parallel/test-fs-rm to see if it has similar issues? It is also flaky but does not fail quite as often.

@bcoe
Contributor

bcoe commented Jan 17, 2022

@mhdawson I did notice a couple of missing awaits in test-fs-rm.js, which were addressed in this PR:

6b9d2ae#diff-596c3dc1f2c2b0c3dc8f6c087b9e0f1910188ff28eaee22800367acc0e9a4f43L188

I'm not 100% sure they're the cause of the flakes, but I could imagine them causing some weirdness depending on timing. It seemed like the step that actually fails is the tmpdir helper attempting to clean up after itself; perhaps the cleanup step fails on some operating systems because there are other processes attempting file operations at the same time?
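
To make that timing concern concrete, here is a minimal, hypothetical sketch (not the actual test-fs-rm.js code) of how an un-awaited recursive removal can race with a later cleanup pass:

```js
'use strict';
// Hypothetical illustration only: a missing `await` lets the recursive
// removal keep running in the background, so a later cleanup pass (such
// as the tmpdir helper) can race with it on the same files.
const fs = require('fs/promises');
const path = require('path');
const os = require('os');

async function main() {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'rm-race-'));
  await fs.mkdir(path.join(dir, 'a', 'b'), { recursive: true });

  // Buggy: promise not awaited, so the removal may still be in flight
  // when the code below runs (errors swallowed only to keep the sketch
  // self-contained).
  fs.rm(dir, { recursive: true, force: true }).catch(() => {});

  // "Cleanup" running while the removal above is still in progress can
  // fail with EBUSY/ENOTEMPTY/EPERM on Windows.
  await fs.rm(dir, { recursive: true, force: true });
}

main().catch(console.error);
```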

Should we keep this open for a little bit and see if the flakes have gone away? How often were we seeing the failures?

@mhdawson
Member Author

Should we keep this open for a little bit and see if the flakes have gone away? How often were we seeing the failures?

That was my plan. I'll leave this open for a week or so to see how the flakes look on new PRs. If things are green (keeping my fingers crossed), then I'll go ahead and close this.

@mhdawson
Member Author

mhdawson commented Jan 18, 2022

@bcoe I'll also say that today things look better in the CI so I'm hopeful. Thanks for your help on this one.

@bcoe
Contributor

bcoe commented Jan 22, 2022

@mhdawson if I'm reading the daily reports correctly, I'm still seeing quite a few flakes in test-fs-rmdir-recursive.

One thought I had was serializing the tests, so that the promise API and callback API are not being exercised at the same time -- my hunch continues to be that parallel operations cause issues on Windows due to contention on files.

At the very least, perhaps we will get a clearer picture of which test is failing to clean up.
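
As a rough sketch of that serialization idea (hypothetical structure and paths, not the actual test file), the callback-API cases could be awaited to completion before the promise-API cases start:

```js
'use strict';
// Hypothetical sketch: give each API style its own tree and run them
// strictly one after the other, so they never contend on the same files.
const fs = require('fs');
const fsPromises = require('fs/promises');
const path = require('path');
const os = require('os');

async function makeTree() {
  const dir = await fsPromises.mkdtemp(path.join(os.tmpdir(), 'rmdir-'));
  await fsPromises.mkdir(path.join(dir, 'a', 'b'), { recursive: true });
  return dir;
}

function callbackCase(dir) {
  return new Promise((resolve, reject) => {
    fs.rm(dir, { recursive: true, force: true },
          (err) => (err ? reject(err) : resolve()));
  });
}

async function main() {
  await callbackCase(await makeTree());            // phase 1: callback API
  await fsPromises.rm(await makeTree(),            // phase 2: promise API,
    { recursive: true, force: true });             // only after phase 1 ends
}

main().catch((err) => { process.exitCode = 1; console.error(err); });
```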

@mhdawson
Member Author

There have been a few failures within the last few days, but they might be from PRs that were not rebased. I figure we should wait another week and then take a look at the reliability report again.

@mhdawson
Member Author

@bcoe unfortunately, based on the latest reliability report (nodejs/reliability#185), it still looks like failures are occurring with parallel/test-fs-rmdir-recursive. The latest failure was just yesterday, and the PR being tested was only opened yesterday as well.

@lpinca I'm going to propose we land this PR. I'll keep an eye on the CI, and if we no longer see the test failing in the next week or so I'll back out the change; otherwise we can do that as part of whatever future fixes/updates make the test more reliable on Windows.

@lpinca
Member

lpinca commented Jan 29, 2022

@mhdawson I agree, go ahead.

@lpinca added the request-ci (Add this label to start a Jenkins CI on a PR) label on Jan 29, 2022
@github-actions bot removed the request-ci label on Jan 29, 2022

mhdawson added a commit that referenced this pull request Jan 31, 2022
Refs: #41201

From recent reliability reports this is now the most
common failure by far in CI runs. Mark the test as
flaky until the issue is resolved.

Signed-off-by: Michael Dawson <mdawson@devrus.com>

PR-URL: #41533
Reviewed-By: Ben Coe <bencoe@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
@mhdawson
Member Author

Landed in 7faf763

@mhdawson mhdawson closed this Jan 31, 2022
ruyadorno pushed a commit that referenced this pull request Feb 8, 2022
danielleadams pushed a commit that referenced this pull request Mar 2, 2022
danielleadams pushed a commit that referenced this pull request Mar 3, 2022
danielleadams pushed a commit that referenced this pull request Mar 14, 2022