
Pitest JUnit 5 runs all parameterized tests despite fullMutationMatrix set to false #14

Open
dmitry-timofeev opened this issue Oct 13, 2018 · 8 comments


@dmitry-timofeev
Contributor

dmitry-timofeev commented Oct 13, 2018

Pitest with JUnit 5 does not seem to respect the fullMutationMatrix parameter when running parameterized tests.

According to the output, Pitest keeps running the parameterized tests that may kill the mutant even after one of them has already killed it:

1:42:26 PM PIT >> FINE : Mutation MutationIdentifier 
[location=Location [clazz=dt.pitest.Calculator, method=increment, methodDesc=(I)I], 
indexes=[6],
mutator=org.pitest.mutationtest.engine.gregor.mutators.ReturnValsMutator]
detected = KILLED by [dt.pitest.stderr  : CalculatorTest.[1] -1, 0,
  dt.pitest.CalculatorTest.[2] 0, 1,
  dt.pitest.CalculatorTest.[3] 1, 2,
  dt.pitest.CalculatorTest.[4] 2, 3,
  dt.pitest.CalculatorTest.[5] 3, 4
]
@dmitry-timofeev
Contributor Author

Here is a project reproducing the issue: https://github.com/dmitry-timofeev/pitest-with-junit5-issue/tree/master

@hcoles
Contributor

hcoles commented Oct 13, 2018

Thanks for the report.

This is currently expected behaviour for the junit 5 plugin - it all boils down to what you consider a "test" to be.

JUnit 4 (and now 5) treats each test class as the basic atomic unit of testiness. So all (or none) of the tests in a class are run. Pitest's JUnit 4 implementation goes to a lot of effort to break tests down into small parts (using JUnit 4's filter mechanism), so that (depending on the runner) each test method or parameter runs independently, and can be targeted at a mutant independently of the other "tests" in a class.
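
For illustration, a minimal sketch of that filter mechanism, not pitest's actual code (the test class and method here are placeholders):

import org.junit.Test;
import org.junit.runner.Description;
import org.junit.runner.JUnitCore;
import org.junit.runner.Request;
import org.junit.runner.Result;
import org.junit.runner.manipulation.Filter;

public class SingleMethodRunExample {

    // Placeholder standing in for the issue's CalculatorTest.
    public static class CalculatorTest {
        @Test public void incrementsByOne() { }
        @Test public void decrementsByOne() { }
    }

    public static void main(String[] args) {
        // Narrow the class-level request down to a single method. This only
        // works when the underlying Runner implements Filterable.
        Description single = Description.createTestDescription(
                CalculatorTest.class, "incrementsByOne");
        Request request = Request.aClass(CalculatorTest.class)
                .filterWith(Filter.matchMethodDescription(single));
        Result result = new JUnitCore().run(request);
        System.out.println("Ran " + result.getRunCount() + " test(s)"); // prints 1
    }
}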

This introduces a lot of complexity and is one of the main causes of tests that run green normally but run red for pitest. Sometimes the overhead it adds is more than the savings introduced by running fewer test methods; sometimes the savings far outweigh the overhead.

The JUnit 5 plugin does not currently try to break tests down into smaller parts. I've not looked into how things hang together in JUnit 5, but I would be reluctant to try to do this unless it can be done much more cleanly than was the case for JUnit 4.

A different, but related, optimisation that pitest performs is to exit out of a running test once a failure has been detected. Again I've not looked at the details of how this might be done in junit 5, but implementing this would probably add less complexity.
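
Sketching the detection half of that in JUnit 5 terms (TestExecutionListener is standard JUnit Platform API; this is not pitest code, and actually stopping the remaining tests would need more than this):

import org.junit.platform.engine.TestExecutionResult;
import org.junit.platform.launcher.TestExecutionListener;
import org.junit.platform.launcher.TestIdentifier;

import java.util.concurrent.atomic.AtomicBoolean;

// Records whether any test has failed yet. For mutation testing a failure
// means the mutant is killed, so nothing that runs afterwards adds information.
public class FirstFailureListener implements TestExecutionListener {

    private final AtomicBoolean failureSeen = new AtomicBoolean();

    @Override
    public void executionFinished(TestIdentifier id, TestExecutionResult result) {
        if (result.getStatus() == TestExecutionResult.Status.FAILED) {
            failureSeen.set(true);
        }
    }

    public boolean hasSeenFailure() {
        return failureSeen.get();
    }
}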

@dmitry-timofeev
Contributor Author

Thank you for the explanation.
To confirm I understand this correctly, is the main challenge

  1. Collecting coverage information per test invocation, or
  2. Selecting individual test invocations to run with mutants based on the coverage information?

I have touched the JUnit 5 extension mechanism only a little, but it seems quite rich. Here are some things that might be relevant to this issue:

  • Each test container (like a class, a dynamic, or a parameterized test) and each test invocation has a unique identifier. The discovery mechanism allows selecting tests by their unique identifiers; the question is whether those identifiers are stable for an unchanged test class.
  • A launcher allows to register test execution listeners, which are notified of the result of each test. Another way to get notified of test execution results is to register a lifecycle extension.
  • I haven't found a direct way to abort execution of launched tests, but there is also an extension mechanism to filter already-selected tests (that's how @Disabled is implemented): conditional test execution. The question is whether it makes sense to select tests at the highest granularity (i.e., creating a launcher for each test invocation that may kill a mutant), or to select all that may kill and disable the remaining tests once one has killed the mutant (a sketch follows this list).
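
A minimal sketch of that last idea (ExecutionCondition is real Jupiter API; the shared kill flag, and whatever pitest-side code would set it, are hypothetical):

import org.junit.jupiter.api.extension.ConditionEvaluationResult;
import org.junit.jupiter.api.extension.ExecutionCondition;
import org.junit.jupiter.api.extension.ExtensionContext;

import java.util.concurrent.atomic.AtomicBoolean;

// Disables remaining tests once a mutant has been killed.
public class StopWhenKilledCondition implements ExecutionCondition {

    // Hypothetical shared flag, set by some listener when a test
    // invocation fails (i.e., kills the mutant).
    static final AtomicBoolean MUTANT_KILLED = new AtomicBoolean();

    @Override
    public ConditionEvaluationResult evaluateExecutionCondition(ExtensionContext context) {
        if (MUTANT_KILLED.get()) {
            return ConditionEvaluationResult.disabled("mutant already killed");
        }
        return ConditionEvaluationResult.enabled("mutant still alive");
    }
}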

@dmitry-timofeev
Contributor Author

A different, but related, optimisation that pitest performs is to exit out of a running test once a failure has been detected. Again I've not looked at the details of how this might be done in junit 5, but implementing this would probably add less complexity.

That must be the issue I originally noticed and captured in the reproducing project, where all the test invocations may kill, and do kill, the mutant. It does not at the moment represent the case where some of the tests do not cover the mutated line of code. Should this issue be split?

@hcoles
Contributor

hcoles commented Oct 14, 2018

@dmitry-timofeev

1) and 2) are the same thing (or at least the least complex solution makes them the same thing).

Pitest has the concept of a "TestUnit": the smallest independently executable part of a test. It is a thing you can map to line coverage, and a thing you can execute independently against a mutant.
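
Roughly this shape, paraphrasing rather than quoting the org.pitest.testapi source (exact signatures differ between versions):

// Paraphrased shape of pitest's TestUnit abstraction.
// Description and ResultCollector are pitest's own types.
public interface TestUnit {

    // A stable description, which is what line coverage gets mapped to.
    Description getDescription();

    // Run this unit in isolation against the current (possibly mutated)
    // classes, reporting outcomes to the collector.
    void execute(ResultCollector rc);
}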

For JUnit 5 this is currently all tests that are defined within a class.

For JUnit 4 it is all tests defined within a class, or each individual method if the runner implements Filterable, unless the class has a before- or after-class annotation (in which case it's all tests in the class again), or a class rule (in which case it's again all the tests in the class), etc. (it gets messy and complex).

For TestNG it is all the tests defined in a single class. There is a mechanism by which TestNG tests could be broken down into smaller units, but the performance overhead is so high that there is no point in doing it.

As a compromise, once one test within a TestNG test class has failed, the adapter throws a "skip test" exception for all the following tests.
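
The pattern is roughly the following (IInvokedMethodListener and SkipException are standard TestNG APIs; this is a sketch, not pitest's actual adapter):

import org.testng.IInvokedMethod;
import org.testng.IInvokedMethodListener;
import org.testng.ITestResult;
import org.testng.SkipException;

import java.util.concurrent.atomic.AtomicBoolean;

// Once one test method has failed, skip every subsequent invocation.
public class SkipAfterFirstFailure implements IInvokedMethodListener {

    private final AtomicBoolean failed = new AtomicBoolean();

    @Override
    public void beforeInvocation(IInvokedMethod method, ITestResult result) {
        if (failed.get()) {
            // Throwing SkipException here marks the test as skipped.
            throw new SkipException("a previous test already killed the mutant");
        }
    }

    @Override
    public void afterInvocation(IInvokedMethod method, ITestResult result) {
        if (result.getStatus() == ITestResult.FAILURE) {
            failed.set(true);
        }
    }
}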

So, in answer to your second question, this probably shouldn't be pulled out as a separate issue, as the JUnit 5 plugin will need to do one thing or the other (or neither).

My preference is to follow the much less complex TestNG model, as the JUnit 4 approach has caused constant pain. It will all depend on the details of how things are done in JUnit 5 though, and I've not had a chance to look.

@dmitry-timofeev
Contributor Author

For JUnit 5 this is currently all tests that are defined within a class.

It seems that a JUnit 5 source method corresponds to a Pitest TestUnit (I've verified that with a unit test confirming this behaviour; I will submit it in a PR later):
https://github.com/pitest/pitest-junit5-plugin/blob/master/src/main/java/org/pitest/junit5/JUnit5TestUnitFinder.java#L56-L59

A source method is either an individual test (e.g., @Test), a test container (e.g., @ParameterizedTest, @RepeatedTest), or a dynamic test container (aka a test factory — @TestFactory).

This makes pitest-junit5-plugin operate either at the fine granularity of individual tests or at the coarser granularity of test containers. I guess the JUnit Jupiter engine handles running individual tests correctly (including cases when they are in a class with static lifecycle callbacks), or there are just no tests of that :-)
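
For reference, a small sketch of discovery through the standard Launcher API (the class name is from my reproducer). Note that the invocations of a @ParameterizedTest are registered dynamically at execution time, so discovery only sees the container:

import org.junit.platform.launcher.Launcher;
import org.junit.platform.launcher.LauncherDiscoveryRequest;
import org.junit.platform.launcher.TestPlan;
import org.junit.platform.launcher.core.LauncherFactory;

import static org.junit.platform.engine.discovery.DiscoverySelectors.selectClass;
import static org.junit.platform.launcher.core.LauncherDiscoveryRequestBuilder.request;

public class DiscoveryExample {
    public static void main(String[] args) {
        Launcher launcher = LauncherFactory.create();
        LauncherDiscoveryRequest req = request()
                .selectors(selectClass("dt.pitest.CalculatorTest"))
                .build();
        TestPlan plan = launcher.discover(req);
        // Print each discovered identifier: containers vs individual tests.
        plan.getRoots().forEach(root ->
                plan.getDescendants(root).forEach(id ->
                        System.out.println((id.isContainer() ? "container: " : "test: ")
                                + id.getUniqueId())));
    }
}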

Does this make the issue of aborting tests once a failing one is discovered less severe (unless someone uses @TestFactory, or @ParameterizedTest with lots of invocations or some expensive tests)?

For TestNG it is all the tests defined in a single class. There is a mechanism by which TestNG tests could be broken down into smaller units, but the performance overhead is so high that there is no point in doing it.

It's an interesting question whether it is beneficial to run individual tests vs. all-in-a-class; your TestNG results suggest it might not always be (and there is probably no easy answer: it will likely differ based on the actual test class).

@dmitry-timofeev
Contributor Author

dmitry-timofeev commented Oct 14, 2018

I've built a prototype. It was not as straightforward as I expected, because extensions are loaded as service providers and instantiated by the engine (the JUnit Jupiter engine in our case), hence there is no way to inject an arbitrary object into an extension.
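
Concretely, the only channel I found between the pitest side and a service-loaded extension is global state, for example a system property (the property name here is made up):

import org.junit.jupiter.api.extension.ConditionEvaluationResult;
import org.junit.jupiter.api.extension.ExecutionCondition;
import org.junit.jupiter.api.extension.ExtensionContext;

// The engine instantiates this itself (e.g. via ServiceLoader), so there are
// no constructor arguments: all state has to arrive through something
// globally reachable, such as a system property.
public class KillFlagCondition implements ExecutionCondition {

    // Made-up property name; the pitest side would set it after a kill:
    //   System.setProperty("pitest.mutant.killed", "true");
    @Override
    public ConditionEvaluationResult evaluateExecutionCondition(ExtensionContext ctx) {
        return Boolean.getBoolean("pitest.mutant.killed")
                ? ConditionEvaluationResult.disabled("mutant already killed")
                : ConditionEvaluationResult.enabled("keep running");
    }
}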

I've got the following results on a module with many but fast parameterized tests:

  • Same number of killed mutants (as expected),
  • ~40% fewer executed tests (is that counted in TestUnits or in individual invocations?),
  • but just 2 seconds faster.
0.7

================================================================================
- Timings
================================================================================
> scan classpath : < 1 second
> coverage and dependency analysis : 3 seconds
> build mutation tests : < 1 second
> run mutation analysis : 35 seconds
--------------------------------------------------------------------------------
> Total  : 39 seconds
--------------------------------------------------------------------------------
================================================================================
- Statistics
================================================================================
>> Generated 197 mutations Killed 163 (83%)
>> Ran 28048 tests (142.38 tests per mutation)

0.8-SNAPSHOT-patched

================================================================================
- Timings
================================================================================
> scan classpath : < 1 second
> coverage and dependency analysis : 3 seconds
> build mutation tests : < 1 second
> run mutation analysis : 33 seconds
--------------------------------------------------------------------------------
> Total  : 37 seconds
--------------------------------------------------------------------------------
================================================================================
- Statistics
================================================================================
>> Generated 197 mutations Killed 163 (83%)
>> Ran 10596 tests (53.79 tests per mutation)

@dmitry-timofeev
Contributor Author

dmitry-timofeev commented Oct 15, 2018

I managed to get a profile of each of the minions that Pitest creates, using https://github.com/jvm-profiling-tools/async-profiler. Here is one of them as an interactive flame graph:
profile-1055927544.svg.zip

Some observations for that particular minion:

  • Almost 50% of the time the JVM is compiling methods
  • Discovery takes ~3%
  • Test execution takes ~26%
    • The actual tests (i.e., the test code) take ~1%

Collecting the profiles required some hacks because pitest runs the minions in separate processes. It would be great to be able to use placeholders in JVM args that pitest replaces with, say, the minion identifier, e.g. <jvmArgs><arg>-DlogFile=events-{minion.name}.log</arg></jvmArgs>, with {minion.name} replaced by the minion's unique, human-readable identifier. That would allow using various profiling tools with no patches to pitest.
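
In pitest-maven terms the idea would look something like this (jvmArgs is the existing plugin parameter; the {minion.name} placeholder is the hypothetical addition, it does not exist in pitest today):

<!-- Sketch of the proposal; {minion.name} is hypothetical. -->
<configuration>
  <jvmArgs>
    <arg>-DlogFile=events-{minion.name}.log</arg>
  </jvmArgs>
</configuration>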
