Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Share SingleFileTest publishing between test classes #45523

Merged
merged 5 commits into from
Dec 8, 2020

Conversation

agocke
Copy link
Member

@agocke agocke commented Dec 3, 2020

Tests in the AppHost.Bundle.Tests assembly seem to randomly fail due to a race condition
with the file system. They try to create separate '0','1','2'... subdirectories to isolate
the published files for each test, but I think what's happening is that files may be
marked for deletion, but then not deleted until a later write. For instance, files in
'2' may be marked for deletion and some may fail a File.Exists check, which leads to
'2' being recreated, at which point deletion may occur, which will cause the current test
to fail due to a concurrent write operation.

This change tries to simplify the system by sharing the test state across all the classes
in the assembly, instead of per-class, and then cleaning up only when all of them are
finished executing.

Fixes #43316

Tests in the AppHost.Bundle.Tests assembly seem to randomly fail due to a race condition
with the file system. They try to create separate '0','1','2'... subdirectories to isolate
the published files for each test, but I think what's happening is that files may be
marked for deletion, but then not deleted until a later write. For instance, files in
'2' may be marked for deletion and some may fail a File.Exists check, which leads to
'2' being recreated, at which point deletion may occur, which will cause the current test
to fail due to a concurrent write operation.

This change tries to simplify the system by sharing the test state across all the classes
in the assembly, instead of per-class, and then cleaning up only when all of them are
finished executing.

Fixes dotnet#43316
@ghost
Copy link

ghost commented Dec 3, 2020

Tagging subscribers to this area: @agocke, @vitek-karas
See info in area-owners.md if you want to be subscribed.

Issue Details

Tests in the AppHost.Bundle.Tests assembly seem to randomly fail due to a race condition
with the file system. They try to create separate '0','1','2'... subdirectories to isolate
the published files for each test, but I think what's happening is that files may be
marked for deletion, but then not deleted until a later write. For instance, files in
'2' may be marked for deletion and some may fail a File.Exists check, which leads to
'2' being recreated, at which point deletion may occur, which will cause the current test
to fail due to a concurrent write operation.

This change tries to simplify the system by sharing the test state across all the classes
in the assembly, instead of per-class, and then cleaning up only when all of them are
finished executing.

Fixes #43316

Author: agocke
Assignees: -
Labels:

area-Single-File

Milestone: -

@vitek-karas
Copy link
Member

I think that if your theory is correct then we should actually change the way TestFixtures are created in the test infra - instead of trying to reuse folders (1, 2, ...) we should generate a random unique name (GUID?) each time - if one of the fixtures fails to delete itself properly due to file system races it will leave garbage on disk, but it will NOT fail the tests.

@agocke
Copy link
Member Author

agocke commented Dec 3, 2020

I can do that too -- GUID works but is probably overkill tbh. There's a GetTempFileName that's probably sufficient for lower levels of contention.

I'll add that in a second commit

@elinor-fung
Copy link
Member

+1 for @vitek-karas's suggestion for changing how we create the new path for test fixtures. If we do that, we shouldn't need to have the collection grouping, right? (So we can keep running the tests in parallel to avoid increasing test run times?)

That approach should also handle other cases like the StaticHost tests (failed in this PR: https://dev.azure.com/dnceng/public/_build/results?buildId=906837&view=ms.vss-test-web.build-test-results-tab&runId=28928456&resultId=100001&paneView=debug) - assuming your theory applies to that as well

@danmoseley
Copy link
Member

@JeremyKuhne just curious, does this theory match your knowledge of how filesystems can work on Linux? I remember something similar on Windows.

@JeremyKuhne
Copy link
Member

does this theory match your knowledge of how filesystems can work on Linux?

Historically Windows marked files for deletion and would remove them once all file handles were closed. That was recently changed so that the file is immediately moved to a new, "hidden" entry to open up the named location, specifically to better support Linux scenarios.

File.Exists is particularly evil, as @jaredpar will happily attest to. It returns true for "this existed at some point after you asked me" and false for "I can't tell you if it exists". It's use is often a red flag that code will not handle contention. :)

Best practice for tests is to always use a randomly generated sub-folder name. Path.GetRandomFileName() is a safe enough bet to avoid contention and only needs 12 characters (vs 36 with a guid).

@agocke agocke merged commit cbc10fa into dotnet:master Dec 8, 2020
@agocke agocke deleted the flaky-apphost-test branch December 8, 2020 03:21
@ghost ghost locked as resolved and limited conversation to collaborators Jan 7, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bundle_Is_Extracted test failing intermittently on Linux
6 participants