Testsession crashes after AbortTestRun() with message "Cancelling the operation as requested." #2578

Closed
richardwerkman opened this issue Sep 18, 2020 · 10 comments
@richardwerkman

richardwerkman commented Sep 18, 2020

Description

While working on Stryker mutator I noticed that vstest reports an error when running a session with two assemblies. At some point Stryker knows that all mutations have been tested, so we can abort/cancel the test run. This works most of the time. Sometimes, however, vstest sends an error with the message "Cancelling the operation as requested." After that the session seems to have crashed. No clear exception is thrown and it's not clear how to handle the error.

This might be a bug in vstest, or I may just not know what the error means and how to work around it.

Steps to reproduce

The error occurs when running stryker against two test projects.
1.
a. Clone Stryker-net
b. Rename dir to stryker-net-main
c. Checkout branch 1136_multi_test_project_bug
2.
a. Clone Stryker-net
b. Checkout branch multi_test_project_example
3. Open stryker-net-main (Solution Stryker.CLI.sln)
3. Run project Stryker.CLI in that solution with the following settings (Project Stryker.CLI -> Properties -> Debug)
Working directory: {path where you cloned}\stryker-net\src\Stryker.Core\Stryker.Core
Application arguments: -f
4. Set a breakpoint on this line in the clone from step 1
5. Run clone 1 with the settings as in step 3. The breakpoint should go off in a few minutes.

windows example for step 1&2:

git clone https://github.com/stryker-mutator/stryker-net.git
ren stryker-net stryker-net-main
cd ./stryker-net-main
git checkout 1136_multi_test_project_bug
cd ..
git clone https://github.com/stryker-mutator/stryker-net.git
cd ./stryker-net
git checkout multi_test_project_example

Expected behavior

I expect a clear error to occur or no error at all.

Actual behavior

Vstest sends an error message that we are unable to handle. If I throw an exception (as in the reproduction code), vstest swallows it and doesn't let our own code handle it. I see no way to detect the crash of the test session other than the message.

It would be great if someone could explain to me what the message means and why the vstest session crashes.

Diagnostic logs

As Stryker runs a lot of test runs and test sessions, it's not very helpful to send all the logs. Stryker writes them to file when the steps to reproduce are followed.

Environment

I ran on Windows 10 with vstest nuget package version 16.7.1.

@nohwnd
Member

nohwnd commented Sep 29, 2020

@Sanan07 I saw that you responded to other issues around run cancellation. Could you have a look at this one, please?

@Sanan07
Contributor

Sanan07 commented Sep 29, 2020

@richardwerkman I tried to reproduce the issue as you described above, but got different behavior.

Step 3 did not work for me as written (Application arguments: -f): I got an error, so I needed to specify the project to work on.

So I changed application arguments to --project-file=C:\Work\Issues\2578\example\stryker-net\src\Stryker.RegexMutators\Stryker.RegexMutators\Stryker.RegexMutators.csproj.

I tried both projects, RegexMutator.csproj and DataCollector.csproj, and both gave me the same result.

After a successful run I got this message, and the breakpoint was not hit.

stry

Sanan07 self-assigned this Oct 1, 2020
@richardwerkman
Author

@Sanan07 Hi, I've updated the steps to reproduce. I just followed them myself and got the error this issue is about:
image

My guess is that you forgot to check out the branches? If you run on master you won't get the error. And you have to clone twice, as Stryker cannot run on itself.

If you follow the steps exactly like I wrote them you should see the error too. Let me know if something is still unclear :)

@Sanan07
Contributor

Sanan07 commented Oct 2, 2020

@richardwerkman Thanks, I was able to repro the issue, but after getting that exception StrykerCore continued working.
Could you please launch the vstest console inside your application with the /diag option (https://docs.microsoft.com/en-us/visualstudio/test/vstest-console-options?view=vs-2019) to get diagnostic logs, so we can see where the exception is thrown?

@richardwerkman
Author

Great that you can now reproduce this.

Yes, Stryker keeps working, and that is one of my concerns. I'll explain why: I throw an exception, but vstest seems to swallow it somehow, so I can't let the other code in Stryker know the error occurred. After this error the vstest host has crashed. Stryker spins up as many vstest hosts as your PC has CPUs, so on large Stryker runs the vstest hosts crash one by one until there are no hosts left and no more mutations are being tested.

This only seems to happen when multiple test assemblies are passed to vstest, and it seems to happen irregularly.

I think the /diag option has to be passed here: https://github.com/stryker-mutator/stryker-net/blob/b67ddd7b25409207eb712d44410e62dc8f343c5c/src/Stryker.Core/Stryker.Core/TestRunners/VsTest/VsTestRunner.cs#L415

Could you try it yourself? I have no knowledge of what to do with the vstest logs.

@Sanan07
Contributor

Sanan07 commented Oct 5, 2020

@richardwerkman I was able to run the app without the exception by removing the timeout option from the runsettings (timeoutSettings).

@richardwerkman
Author

Interesting. So could that mean the error is raised when a timeout is triggered? Removing the timeout option is not a solution for Stryker, by the way. Some of the mutations we place cause endless loops. If no timeout is passed, some test runs will just run forever and Stryker will never finish.

But it's good to know the issue is correlated with timeouts.
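
To make the timeout requirement concrete, here is a minimal sketch of deriving a per-run timeout from the duration of the unmutated baseline run. It is written in Python for brevity rather than in Stryker's C#, and the function name, the factor and the overhead value are all hypothetical illustrations, not Stryker's actual formula.

```python
def mutant_run_timeout_ms(baseline_ms: int, factor: float = 1.5, overhead_ms: int = 5000) -> int:
    """Return a timeout for a mutated test run, derived from the baseline run.

    baseline_ms: how long the test run took without any mutation active.
    factor / overhead_ms: hypothetical safety margins; tune them per project.
    """
    if baseline_ms < 0:
        raise ValueError("baseline_ms must be non-negative")
    # Scale the baseline, then add a fixed allowance for runner startup.
    return int(baseline_ms * factor) + overhead_ms
```

With these assumed defaults, a baseline of 2000 ms gives a timeout of 8000 ms: long enough for a healthy run, short enough to cut off a mutant stuck in an endless loop.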

@Sanan07
Contributor

Sanan07 commented Oct 5, 2020

@richardwerkman I decreased the timeout to 1000 ms and the run was aborted very quickly (RunSettings description).
So you will need to determine a suitable timeout for each of your test cases for the run to succeed.

Sanan07 closed this as completed Oct 12, 2020
@richardwerkman
Author

Yes, indeed the error does not seem to occur when the timeout is drastically lowered. But that does not solve our issue. It just skips running the tests and marks all of our mutations as timed out. We want to know whether the tests succeeded, so we can mark the mutants as killed or survived.

Could you please reopen the issue, as it is not resolved yet?
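
The distinction between killed, survived and timeout can be sketched as follows. This is an illustrative Python model only; the enum, the function and its parameters are hypothetical and do not mirror Stryker's real types.

```python
from enum import Enum
from typing import Iterable


class MutantStatus(Enum):
    KILLED = "killed"      # at least one test failed against the mutant
    SURVIVED = "survived"  # every test passed despite the mutation
    TIMEOUT = "timeout"    # the run was cut off without a failing test


def classify_mutant(test_passed: Iterable[bool], timed_out: bool) -> MutantStatus:
    """Classify one mutant from the test results collected before the run ended."""
    results = list(test_passed)
    # A failing test is decisive even if the run later timed out.
    if any(not passed for passed in results):
        return MutantStatus.KILLED
    if timed_out:
        return MutantStatus.TIMEOUT
    return MutantStatus.SURVIVED
```

Under this model, a timeout that is too low ends every run before any result arrives, so every mutant is classified as TIMEOUT and the killed/survived information is lost.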

@rouke-broersma

rouke-broersma commented Dec 11, 2020

@Sanan07 The issue is that the vstest process never exits, even after the timeout we set has expired. Decreasing the timeout does make the tests time out, but that is not our goal.

We try to set a reasonable timeout for the test run to account for any additional processing required by our added mutations; this is not an extremely high timeout, though.

You said before:

but after getting that exception StrykerCore continued working.

True, Stryker continues to work because we use more than one vstest runner process. But we lose more and more processes due to vstest being stuck, and sometimes we lose all of them and Stryker freezes. Also, any test runner we lose means we may not have an accurate result for that test run, and the mutation test run becomes slower and slower as we lose concurrency. We would kill the vstest runner from our side and spawn a new one if we could detect this failure, but the exception never reaches us, so we do not know.
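
The kill-and-respawn idea could look roughly like the sketch below: an external watchdog that does not rely on any exception reaching the caller. It is a generic Python illustration under the assumption that the runner is an ordinary child process; `run_with_watchdog` and `HUNG_CMD` are hypothetical names, and no vstest API is used.

```python
import subprocess
import sys
from typing import List, Optional

# Stand-in for a stuck runner: a child process that sleeps far longer than allowed.
HUNG_CMD = [sys.executable, "-c", "import time; time.sleep(60)"]


def run_with_watchdog(cmd: List[str], timeout_s: float) -> Optional[int]:
    """Run cmd and return its exit code, or None if it was killed after timeout_s."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.kill()  # hard stop: the hung runner is discarded
        proc.wait()  # reap the process so it does not linger
        return None
```

A caller could treat a `None` result as "runner lost", start a fresh process and re-queue the test run, so that concurrency is restored instead of degrading one host at a time.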
