
Pitest JUnit 5 runs all parameterized tests despite fullMutationMatrix set to false #14

Open
dmitry-timofeev opened this issue Oct 13, 2018 · 8 comments


@dmitry-timofeev
Contributor

dmitry-timofeev commented Oct 13, 2018

Pitest with JUnit 5 does not seem to respect the fullMutationMatrix parameter when running parameterized tests.

According to the output, Pitest keeps running the parameterized tests that may kill the mutant even after one of them has already killed it:

1:42:26 PM PIT >> FINE : Mutation MutationIdentifier 
[location=Location [clazz=dt.pitest.Calculator, method=increment, methodDesc=(I)I], 
indexes=[6],
mutator=org.pitest.mutationtest.engine.gregor.mutators.ReturnValsMutator]
detected = KILLED by [dt.pitest.stderr  : CalculatorTest.[1] -1, 0,
  dt.pitest.CalculatorTest.[2] 0, 1,
  dt.pitest.CalculatorTest.[3] 1, 2,
  dt.pitest.CalculatorTest.[4] 2, 3,
  dt.pitest.CalculatorTest.[5] 3, 4
]
@dmitry-timofeev
Contributor Author

Here is a project reproducing the issue: https://github.com/dmitry-timofeev/pitest-with-junit5-issue/tree/master

@hcoles
Contributor

hcoles commented Oct 13, 2018

Thanks for the report.

This is currently expected behaviour for the junit 5 plugin - it all boils down to what you consider a "test" to be.

JUnit 4 (and now 5) treats each test class as the basic atomic unit of testiness. So all (or none) of the tests in a class are run. Pitest's JUnit 4 implementation goes to a lot of effort to break tests down into small parts (using JUnit 4's filter mechanism), so that (depending on the runner) each test method or parameter runs independently, and can be targeted at a mutant independently of the other "tests" in a class.
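
For illustration, a minimal sketch of that filter mechanism, not pitest's actual code (the test class and method here are placeholders):

import org.junit.Test;
import org.junit.runner.Description;
import org.junit.runner.JUnitCore;
import org.junit.runner.Request;
import org.junit.runner.Result;
import org.junit.runner.manipulation.Filter;

public class SingleMethodRunExample {

    // Placeholder standing in for the issue's CalculatorTest.
    public static class CalculatorTest {
        @Test public void incrementsByOne() { }
        @Test public void decrementsByOne() { }
    }

    public static void main(String[] args) {
        // Narrow the class-level request down to a single method. This only
        // works when the underlying Runner implements Filterable.
        Description single = Description.createTestDescription(
                CalculatorTest.class, "incrementsByOne");
        Request request = Request.aClass(CalculatorTest.class)
                .filterWith(Filter.matchMethodDescription(single));
        Result result = new JUnitCore().run(request);
        System.out.println("Ran " + result.getRunCount() + " test(s)"); // prints 1
    }
}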

This introduces a lot of complexity and is one of the main causes of tests that run green normally but run red for pitest. Sometimes the overhead it adds is more than the savings introduced by running fewer test methods; sometimes the savings far outweigh the overhead.

The JUnit 5 plugin does not currently try to break tests down into smaller parts. I've not looked into how things hang together in JUnit 5, but I would be reluctant to try to do this unless it can be done much more cleanly than was the case for JUnit 4.

A different, but related, optimisation that pitest performs is to exit out of a running test once a failure has been detected. Again I've not looked at the details of how this might be done in junit 5, but implementing this would probably add less complexity.
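
Sketching the detection half of that in JUnit 5 terms (TestExecutionListener is standard JUnit Platform API; this is not pitest code, and actually stopping the remaining tests would need more than this):

import org.junit.platform.engine.TestExecutionResult;
import org.junit.platform.launcher.TestExecutionListener;
import org.junit.platform.launcher.TestIdentifier;

import java.util.concurrent.atomic.AtomicBoolean;

// Records whether any test has failed yet. For mutation testing a failure
// means the mutant is killed, so nothing that runs afterwards adds information.
public class FirstFailureListener implements TestExecutionListener {

    private final AtomicBoolean failureSeen = new AtomicBoolean();

    @Override
    public void executionFinished(TestIdentifier id, TestExecutionResult result) {
        if (result.getStatus() == TestExecutionResult.Status.FAILED) {
            failureSeen.set(true);
        }
    }

    public boolean hasSeenFailure() {
        return failureSeen.get();
    }
}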

@dmitry-timofeev
Contributor Author

Thank you for the explanation.
To confirm I understand this correctly, is the main challenge

  1. Collecting coverage information per test invocation, or
  2. Selecting individual test invocations to run with mutants based on the coverage information?

I have touched the JUnit 5 extension mechanism only a little, but it seems quite rich. Here are some things that might be relevant to this issue:

  • Each test container (like a class, a dynamic, or a parameterized test) and each test invocation has a unique identifier. The discovery mechanism allows selecting tests by their unique identifiers; the question is whether those identifiers are stable for an unchanged test class.
  • A launcher allows to register test execution listeners, which are notified of the result of each test. Another way to get notified of test execution results is to register a lifecycle extension.
  • I haven't found a direct way to abort execution of launched tests, but there is also an extension mechanism to filter already-selected tests (that's how @Disabled is implemented): conditional test execution. The question is whether it makes sense to select tests at the highest granularity (i.e., creating a launcher for each test invocation that may kill a mutant), or to select all that may kill and disable the remaining tests once one has killed the mutant (a sketch follows this list).
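
A minimal sketch of that last idea (ExecutionCondition is real Jupiter API; the shared kill flag, and whatever pitest-side code would set it, are hypothetical):

import org.junit.jupiter.api.extension.ConditionEvaluationResult;
import org.junit.jupiter.api.extension.ExecutionCondition;
import org.junit.jupiter.api.extension.ExtensionContext;

import java.util.concurrent.atomic.AtomicBoolean;

// Disables remaining tests once a mutant has been killed.
public class StopWhenKilledCondition implements ExecutionCondition {

    // Hypothetical shared flag, set by some listener when a test
    // invocation fails (i.e., kills the mutant).
    static final AtomicBoolean MUTANT_KILLED = new AtomicBoolean();

    @Override
    public ConditionEvaluationResult evaluateExecutionCondition(ExtensionContext context) {
        if (MUTANT_KILLED.get()) {
            return ConditionEvaluationResult.disabled("mutant already killed");
        }
        return ConditionEvaluationResult.enabled("mutant still alive");
    }
}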

@dmitry-timofeev
Contributor Author

A different, but related, optimisation that pitest performs is to exit out of a running test once a failure has been detected. Again I've not looked at the details of how this might be done in junit 5, but implementing this would probably add less complexity.

That must be the issue I originally noticed and captured in the reproducing project, where all the test invocations may kill, and do kill, the mutant. It does not at the moment represent the case where some of the tests do not cover the mutated line of code. Should this issue be split?

@hcoles
Contributor

hcoles commented Oct 14, 2018

@dmitry-timofeev

1) and 2) are the same thing (or at least the least complex solution makes them the same thing).

Pitest has the concept of a "TestUnit": the smallest independently executable part of a test. It is a thing you can map to line coverage, and a thing you can execute independently against a mutant.
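
Roughly this shape, paraphrasing rather than quoting the org.pitest.testapi source (exact signatures differ between versions):

// Paraphrased shape of pitest's TestUnit abstraction.
// Description and ResultCollector are pitest's own types.
public interface TestUnit {

    // A stable description, which is what line coverage gets mapped to.
    Description getDescription();

    // Run this unit in isolation against the current (possibly mutated)
    // classes, reporting outcomes to the collector.
    void execute(ResultCollector rc);
}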

For JUnit 5 this is currently all tests that are defined within a class.

For JUnit 4 it is all tests defined within a class, or each individual method if the runner implements Filterable, unless the class has a before- or after-class annotation (in which case it's all tests in the class again), or a class rule (in which case it's again all the tests in the class), etc. (it gets messy and complex).

For TestNG it is all the tests defined in a single class. There is a mechanism by which TestNG tests could be broken down into smaller units, but the performance overhead is so high that there is no point in doing it.

As a compromise, once one test within a TestNG test class has failed, the adapter throws a "skip test" exception for all the following tests.
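
The pattern is roughly the following (IInvokedMethodListener and SkipException are standard TestNG APIs; this is a sketch, not pitest's actual adapter):

import org.testng.IInvokedMethod;
import org.testng.IInvokedMethodListener;
import org.testng.ITestResult;
import org.testng.SkipException;

import java.util.concurrent.atomic.AtomicBoolean;

// Once one test method has failed, skip every subsequent invocation.
public class SkipAfterFirstFailure implements IInvokedMethodListener {

    private final AtomicBoolean failed = new AtomicBoolean();

    @Override
    public void beforeInvocation(IInvokedMethod method, ITestResult result) {
        if (failed.get()) {
            // Throwing SkipException here marks the test as skipped.
            throw new SkipException("a previous test already killed the mutant");
        }
    }

    @Override
    public void afterInvocation(IInvokedMethod method, ITestResult result) {
        if (result.getStatus() == ITestResult.FAILURE) {
            failed.set(true);
        }
    }
}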

So, in answer to your second question, this probably shouldn't be pulled out as a separate issue, as the JUnit 5 plugin will need to do one thing or the other (or neither).

My preference is to follow the much less complex TestNG model, as the JUnit 4 approach has caused constant pain. It will all depend on the details of how things are done in JUnit 5 though, and I've not had a chance to look.

@dmitry-timofeev
Contributor Author

For JUnit 5 this is currently all tests that are defined within a class.

It seems that a JUnit 5 source method corresponds to a Pitest TestUnit (I've verified that with a unit test confirming this behaviour; I will submit it in a PR later):
https://github.com/pitest/pitest-junit5-plugin/blob/master/src/main/java/org/pitest/junit5/JUnit5TestUnitFinder.java#L56-L59

A source method is either an individual test (e.g., @Test), a test container (e.g., @ParameterizedTest, @RepeatedTest), or a dynamic test container (aka a test factory — @TestFactory).

This makes pitest-junit5-plugin operate either at the fine granularity of individual tests or at the coarser granularity of test containers. I guess the JUnit Jupiter engine handles running individual tests correctly (including cases when they are in a class with static lifecycle callbacks), or there are just no tests of that :-)
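
For reference, a small sketch of discovery through the standard Launcher API (the class name is from my reproducer). Note that the invocations of a @ParameterizedTest are registered dynamically at execution time, so discovery only sees the container:

import org.junit.platform.launcher.Launcher;
import org.junit.platform.launcher.LauncherDiscoveryRequest;
import org.junit.platform.launcher.TestPlan;
import org.junit.platform.launcher.core.LauncherFactory;

import static org.junit.platform.engine.discovery.DiscoverySelectors.selectClass;
import static org.junit.platform.launcher.core.LauncherDiscoveryRequestBuilder.request;

public class DiscoveryExample {
    public static void main(String[] args) {
        Launcher launcher = LauncherFactory.create();
        LauncherDiscoveryRequest req = request()
                .selectors(selectClass("dt.pitest.CalculatorTest"))
                .build();
        TestPlan plan = launcher.discover(req);
        // Print each discovered identifier: containers vs individual tests.
        plan.getRoots().forEach(root ->
                plan.getDescendants(root).forEach(id ->
                        System.out.println((id.isContainer() ? "container: " : "test: ")
                                + id.getUniqueId())));
    }
}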

Does this make the issue of aborting tests once a failing one is discovered less severe (unless someone uses @TestFactory, or @ParameterizedTest with lots of invocations or some expensive tests)?

For TestNG it is all the tests defined in a single class. There is a mechanism by which TestNG tests could be broken down into smaller units, but the performance overhead is so high that there is no point in doing it.

It's an interesting question whether it is beneficial to run individual tests vs. all-in-a-class; your TestNG results suggest it might not always be (and there is probably no easy answer: it will likely differ based on the actual test class).

@dmitry-timofeev
Contributor Author

dmitry-timofeev commented Oct 14, 2018

I've built a prototype. It was not as straightforward as I expected, because extensions are loaded as service providers and instantiated by the engine (the JUnit Jupiter engine in our case), hence there is no way to inject an arbitrary object into an extension.
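
Concretely, the only channel I found between the pitest side and a service-loaded extension is global state, for example a system property (the property name here is made up):

import org.junit.jupiter.api.extension.ConditionEvaluationResult;
import org.junit.jupiter.api.extension.ExecutionCondition;
import org.junit.jupiter.api.extension.ExtensionContext;

// The engine instantiates this itself (e.g. via ServiceLoader), so there are
// no constructor arguments: all state has to arrive through something
// globally reachable, such as a system property.
public class KillFlagCondition implements ExecutionCondition {

    // Made-up property name; the pitest side would set it after a kill:
    //   System.setProperty("pitest.mutant.killed", "true");
    @Override
    public ConditionEvaluationResult evaluateExecutionCondition(ExtensionContext ctx) {
        return Boolean.getBoolean("pitest.mutant.killed")
                ? ConditionEvaluationResult.disabled("mutant already killed")
                : ConditionEvaluationResult.enabled("keep running");
    }
}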

I've got the following results on a module with many but fast parameterized tests:

  • Same number of killed mutants (as expected),
  • ~40% fewer executed tests (is that counted in TestUnits or in individual invocations?),
  • but just 2 seconds faster.
0.7

================================================================================
- Timings
================================================================================
> scan classpath : < 1 second
> coverage and dependency analysis : 3 seconds
> build mutation tests : < 1 second
> run mutation analysis : 35 seconds
--------------------------------------------------------------------------------
> Total  : 39 seconds
--------------------------------------------------------------------------------
================================================================================
- Statistics
================================================================================
>> Generated 197 mutations Killed 163 (83%)
>> Ran 28048 tests (142.38 tests per mutation)

0.8-SNAPSHOT-patched

================================================================================
- Timings
================================================================================
> scan classpath : < 1 second
> coverage and dependency analysis : 3 seconds
> build mutation tests : < 1 second
> run mutation analysis : 33 seconds
--------------------------------------------------------------------------------
> Total  : 37 seconds
--------------------------------------------------------------------------------
================================================================================
- Statistics
================================================================================
>> Generated 197 mutations Killed 163 (83%)
>> Ran 10596 tests (53.79 tests per mutation)

@dmitry-timofeev
Contributor Author

dmitry-timofeev commented Oct 15, 2018

I managed to get a profile of each of the minions that Pitest creates, using https://github.com/jvm-profiling-tools/async-profiler. Here is one of them as an interactive flame graph:
profile-1055927544.svg.zip

Some observations for that particular minion:

  • Almost 50% of the time the JVM is compiling methods
  • Discovery takes ~3%
  • Test execution takes ~26%
    • The actual tests (i.e., the test code) take ~1%

Collecting the profiles required some hacks because pitest runs the minions in separate processes. It would be great to be able to use placeholders in JVM args that pitest replaces with, say, the minion identifier, e.g. <jvmArgs><arg>-DlogFile=events-{minion.name}.log</arg></jvmArgs>, with {minion.name} replaced by the minion's unique, human-readable identifier. That would allow using various profiling tools with no patches to pitest.
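
In pitest-maven terms the idea would look something like this (jvmArgs is the existing plugin parameter; the {minion.name} placeholder is the hypothetical addition, it does not exist in pitest today):

<!-- Sketch of the proposal; {minion.name} is hypothetical. -->
<configuration>
  <jvmArgs>
    <arg>-DlogFile=events-{minion.name}.log</arg>
  </jvmArgs>
</configuration>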
