-
Notifications
You must be signed in to change notification settings - Fork 695
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Attempt to make functional tests more reliable #5874
Conversation
<MaxParallelThreads>4</MaxParallelThreads> | ||
|
||
<!-- Set this to true if this assembly is willing to participate in parallelization with other assemblies. Test runners can use this information to automatically enable parallelization across assemblies if all the assemblies agree to it. --> | ||
<ParallelizeAssembly>false</ParallelizeAssembly> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this setting nullifies the MaxcpuCount (4)
property because as per the comment it looks like MaxcpuCount specifies the number of distinct DLLs in the run determines the actual number of testhosts started.
. Given that the multiple assemblies are not running parallel then my understanding is that only one test host process is started. I guess we should change to true so that 4 DLLs can run in parallel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since MaxCpuCount
is set to 4
(here), there will be 4 vstest.console.exe
processes spawned. I think it then launches each test assembly in those processes in parallel. Then with MaxParallelThreads
set to 4
, each test assembly runs with 4 threads (if there are enough CPUs).
This is really confusing since there are multiple test runners. You can run tests under xunit.runner.exe
, vstest.console.exe
, etc. I think the ParallelizeAssembly
tells xunit specifically that it should run tests from multiple assemblies at the same time when running under certain test runners. But since we're running the tests under vstest.console.exe
, it seems to me that we should not be using xunit's parallelization. I spent a lot of time read https://xunit.net/docs/runsettings but only experimenting with these numbers did I find a good balance between speed and reliability. Do you know more about how all of this works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like a similar issue has been discussed here. I think we are good because it appears that xUnit parallelization is ignored by the dotnet test
command. One interesting thing I learned from this link is that executing dotnet test
without specifying assemblies runs test assemblies in parallel, whereas dotnet test assembly1.dll assembly2.dll
does not run test assemblies in parallel. I am not sure if this is accurate in .NET 8, but at least that was the case earlier because the linked comment explained the different behaviors with an example.
965bbb3
to
e6aabcc
Compare
e6aabcc
to
ce678f8
Compare
@@ -53,6 +53,9 @@ public static CommandRunnerResult Run(string filename, string workingDirectory = | |||
process.StartInfo.Environment["NUGET_SHOW_STACK"] = bool.TrueString; | |||
process.StartInfo.Environment["NuGetTestModeEnabled"] = bool.TrueString; | |||
process.StartInfo.Environment["UseSharedCompilation"] = bool.FalseString; | |||
process.StartInfo.Environment["DOTNET_SKIP_FIRST_TIME_EXPERIENCE"] = bool.TrueString; | |||
process.StartInfo.Environment["DOTNET_CLI_TELEMETRY_OPTOUT"] = bool.TrueString; | |||
process.StartInfo.Environment["SuppressNETCoreSdkPreviewMessage"] = bool.FalseString; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this a mistake? Why wouldn't we want to suppress the .NET (Core) preview message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦 Yes, thank you for catching that!
@@ -78,7 +78,7 @@ jobs: | |||
condition: succeeded() | |||
inputs: | |||
command: test | |||
arguments: --no-restore --restore:false --no-build --framework $(TestTargetFramework) --blame --logger trx --binarylogger:"$(BinlogDirectory)02-RunTests.binlog" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally, I've disabled the crash dump helper when running tests since all of the memory dumps it provided were always for the main vstest.console.exe which wasn't helpful since the tests themselves are running in a different process
Have you talked to the team who owns dotnet test
? This sounds very surprising so sounds like something that should be fixed for all customers, or if we're doing something wrong, they can help us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see a few issues in the vstest repo:
microsoft/vstest#3936
microsoft/vstest#2593
microsoft/vstest#4376
microsoft/vstest#2952
Nothing specific to the problem I was seeing. Supposedly the blame stuff also doesn't supply enough command-line arguments to capture other exceptions than StackOverflowException
and AccessViolationException
Although now there's an env var we can set to include more arguments to proc dump. The main point in all of this is it seems like adding --diag
to the vstest.console.exe
execution is better. I don't think we have that many "test crashes". But if we feel like we need the memory dumps back, we can do so at a later time in my opinion.
<ParallelizeAssembly>false</ParallelizeAssembly> | ||
|
||
<!-- Set this to true if the assembly is willing to run tests inside this assembly in parallel against each other. Tests in the same test collection will be run sequentially against each other, but tests in different test collections will be run in parallel against each other. Set this to false to disable all parallelization within this test assembly. --> | ||
<ParallelizeTestCollections>false</ParallelizeTestCollections> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I compared the CI run for this PR to a different CI run, because I would have expected that setting ParallelizeTestCollections and ParallelizeAssembly both to false would make the tests run very slowly. But somehow this PR's CI run was margionally faster then the dev branch run.
It's totally not intuitive to me why, but I didn't see this mentioned in the PR description or anywhere else, so I wanted to point out that apparently the CI is not going to take a long longer with this PR merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I struggle to understand it as well. Its still running tests assemblies in parallel with each other, but I guess not running tests themselves in parallel? A machine with only 2 cores would only be able to run two tests at the same time anyway so I think the adjustments don't affect the maximum parallelization we could already achieve on CI agents. It really just lowers the amount of parallelization that's queued up waiting for CPU.
Bug
Fixes: https://github.com/NuGet/Client.Engineering/issues/2501
Fixes: https://github.com/NuGet/Client.Engineering/issues/2866
Regression? Last working version:
Description
The Functional tests seem to very unreliable with a few symptoms that this pull request hopefully addresses:
To attempt to address the issues, I've tweaked the parallelism in the test runner. There are three settings in the
.runsettings
for parallelism:MaxCpuCount
vstest.console.exe
spawns to run tests inMaxParallelThreads
ParallelizeAssembly
true
false
Since we are currently using the default values for
MaxCpuCount
andMaxParallelThreads
, as well as enablingParallelizeAssembly
, I suspect thatvstest.console.exe
is launching 12 process, each one then runs the tests for each assembly with 12 threads, and if one process is running multiple assemblies, they're happening in parallel. My suspicion is that we're over parallelizing tests, leading to crashes. By changing the values to lower, I do not see any increase in test run time and in fact, sometimes the runs are faster. However, in my testing, I rarely see a test host process crashing anymore.I also deleted
build\TestShared\xunit.runner.json
and moved the settings tobuild\xunit.runsettings
so that everything is defined in one place. Some projects were opting into test parallelization with theUseParallelXunit
property which would copy thexunit.runner.json
to the output directory, while other projects were copying it themselves. I found it confusing to determine why some projects were allowed to run in parallel and others weren't so I changed how we run tests in parallel and made it uniform across the repo.Finally, I've disabled the crash dump helper when running tests since all of the memory dumps it provided were always for the main
vstest.console.exe
which wasn't helpful since the tests themselves are running in a different process. Instead, I've added the--diag
switch to the call todotnet test
and now there's a diagnostic log for the main process and all of the child processes with valuable information. These logs go to the binlog location and get uploaded when a test run fails or if you enable diagnostics on a particular pipeline run.PR Checklist
PR has a meaningful title
PR has a linked issue.
Described changes
Tests
Documentation