Improve sharding of parameterized tests #1720

valeraz · 2021-03-22T19:17:11Z

As a flank setup maintainer at Slack, I would like flank to improve its handling of parameterized tests (on Android), so that we can avoid unexpected increases in test suite execution time.

This came up recently because, in our weekly metrics meeting, we noticed a significant increase in the execution time of the CI job that runs instrumentation tests. Upon closer examination, we noticed that the PR where the increase started had added a new Parameterized test. Flank currently treats a parameterized test like any other test method, so it was placed in a shard as if it were 1 test class with multiple test methods. At runtime, this expanded to many more tests, causing that one shard to execute for much longer (i.e. we had an imbalance in sharding). Ideally, flank would "know" how many tests a parameterized test expands to and account for that in its sharding.
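To make the imbalance concrete, here is a minimal sketch of the effect described above. It is not flank's sharding code: `shard_by_weight` is a toy greedy balancer, and the class names and test counts are made up for illustration. The only point is what happens when a parameterized class is weighted by its declared method count instead of its expanded runtime count.

```python
from typing import Dict, List


def shard_by_weight(weights: Dict[str, int], shard_count: int) -> List[List[str]]:
    """Toy greedy balancer: heaviest class first, into the lightest shard so far."""
    shards: List[List[str]] = [[] for _ in range(shard_count)]
    loads = [0] * shard_count
    for name, weight in sorted(weights.items(), key=lambda kv: (-kv[1], kv[0])):
        lightest = loads.index(min(loads))
        shards[lightest].append(name)
        loads[lightest] += weight
    return shards


def actual_loads(shards: List[List[str]], runtime_counts: Dict[str, int]) -> List[int]:
    """Number of tests each shard really executes at runtime."""
    return [sum(runtime_counts[name] for name in shard) for shard in shards]


# Hypothetical suite: test methods visible per class before running anything.
static_counts = {
    "CheckoutTest": 6,
    "FeedTest": 5,
    "LoginTest": 5,
    "ProfileTest": 4,
    "SearchTest": 4,
    "LocaleParamTest": 2,  # parameterized: 2 methods declared
}
# At runtime the parameterized class expands to 30 test cases (assumed figure).
runtime_counts = {**static_counts, "LocaleParamTest": 30}

naive = shard_by_weight(static_counts, 2)   # unaware of the expansion
aware = shard_by_weight(runtime_counts, 2)  # weights by expanded count

print("naive shard sizes:", actual_loads(naive, runtime_counts))
print("aware shard sizes:", actual_loads(aware, runtime_counts))
```

In this toy setup the unaware plan leaves one shard running 40 tests while the other runs 14. Weighting by the expanded count gives 30 and 24. The parameterized class still cannot be split across shards here, which is why the options discussed later in the thread matter.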

Describe the solution
@bootstraponline has suggested: "some intelligent preprocessing of the AST and figure out what the parameterized tests will expand to."

Describe alternatives considered

Perhaps an easier one would be to place each parameterized test class in its own shard? (as a config option)
I also wonder whether we should consider having an option to prevent parameterized tests from being added to a test execution: in theory these types of tests (targeting many permutations) should live in jvm-land. In our particular case, the test class was easy to move to a jvm test.

asadsalman · 2021-04-06T23:50:36Z

I have also run into something similar. One of our parameterized test classes took 10 minutes longer than any other shard. While I understand the limitations of sharding parameterized classes, it took a long time to figure out what was happening at the test level. Neither FTL nor Flank shows class-level durations, so it wasn't until I wrote a small JUnit XML parser that I realized one class was taking 10+ minutes.
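For anyone who needs the same visibility, a small parser along these lines can be sketched as follows. This is not the script mentioned above. It assumes the conventional JUnit XML layout (`testcase` elements with `classname` and `time` attributes), and the sample report is invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict


def class_durations(junit_xml: str) -> Dict[str, float]:
    """Sum testcase durations per class, slowest class first."""
    root = ET.fromstring(junit_xml)
    totals: Dict[str, float] = defaultdict(float)
    for case in root.iter("testcase"):
        # A missing or empty time attribute is counted as 0 seconds.
        totals[case.get("classname", "<unknown>")] += float(case.get("time") or 0)
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Invented sample report; real reports from a test run may differ in detail.
SAMPLE = """<testsuites>
  <testsuite name="shard_0">
    <testcase classname="com.example.LocaleParamTest" name="renders[0]" time="310.5"/>
    <testcase classname="com.example.LocaleParamTest" name="renders[1]" time="295.5"/>
    <testcase classname="com.example.LoginTest" name="logsIn" time="12.0"/>
  </testsuite>
</testsuites>"""

for name, seconds in class_durations(SAMPLE).items():
    print(f"{name}: {seconds:.1f}s")
```

Sorting by total time puts the class that dominates a shard at the top, which is usually enough to spot a parameterized class that expanded more than expected.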

While a longer term fix to parameterized classes may be harder to implement, printing some warning about parameterized classes taking a long time (compared to other shards etc.) would help narrow things down.

Thoughts @bootstraponline @jan-gogo?

bootstraponline · 2021-04-07T00:42:51Z

I agree some type of warning or reporting would be good. Maybe adding a report that includes test time by class, and note the parameterized classes?

In the long term I think we'll likely move to runtime discovery of tests in the new corellium backend. That should enable us to get a complete picture of what tests exist. Parsing out the info from bytecode has limitations.

The output of this task should be a few proofs of concept of different approaches and a doc which summarizes the tradeoffs.

Sloox · 2021-06-10T08:50:09Z

Improve Sharding of Parameterized tests

Flank users want flank to handle parameterized tests correctly and to provide more information about their use.

References

Improve sharding of parameterized tests #1720

Motivation

Based on the discussion in #1720, flank needs to improve its handling of parameterized tests, as using them can significantly increase test execution time.

Goals

Improving support for parameterized tests by investigating/implementing the following:

Provide information and a warning in flank when parameterized tests are used.
Add options for parameterized tests that help with their usage. These can include ignoring them, placing them in a separate shard, and possibly smart flank parameterization.
- Intelligent pre-processing of the AST and figuring out how to handle parameterized tests should be investigated separately due to possible complexity.
Investigate and add more types of reporting to flank to help narrow down issues such as those caused by parameterized tests, e.g. test time by class or by test.

Finally, give a recommendation as to which path to take regarding the above goals.

Design

Flank already partially supports Parameterized tests. As per the discussion, it adds all the iterations of a parameterized test to one single shard, which can dramatically increase the time for that particular shard.

Therefore there should be an information/warning text notifying the user of this possible consequence.
Add an option for parameterized tests that filters them in one of the following ways:
- ignore-all - Ignore parameterized tests and do not add them
- collect-all-single - Collect all the parameterized tests and run them in a single shard
- collect-all-multiple - Smart collection of all the parameterized tests, run separately on a per-case basis
Investigate whether it's possible to add more types of time reporting to flank, including duration tracking at class level and test level.

Implementation

Implementation will be broken up according to how easily each part can be implemented.

Adding an information and/or warning text notifying the user about Parameterized tests is a simple task and can most likely be achieved via println within CreateAndroidTestContext
- Documentation in flank should be updated to highlight the issue associated with Parameterized tests
Adding any new option will require modification to:
- AndroidArgs
  Alongside other changes that will add the new option(s) for use
  A suggested name and description:

  ### Parameterized Tests
  ## Specifies how to handle tests which contain Parameterization.
  ## 4 options are available
  ## default: treat Parameterized tests as normal and shard accordingly
  ## ignore-all: Parameterized tests are ignored and not sharded
  ## collect-all-single: Parameterized tests are collected and put into a single shard
  ## collect-all-multiple: Smart collection of all the parameterized tests, run separately on a per-case basis
  ## Note: if left blank default is used.
  # parameterized-tests: default

The addition of the options will be split into another ticket for investigation as it is a fairly complex task.
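As a rough illustration of how the proposed option values could affect shard planning, here is a minimal sketch. It is not flank code: `ClassInfo`, `plan_shards`, and the round-robin distribution are hypothetical, and the shard layout chosen for each option is one possible reading of the descriptions above. `collect-all-multiple` is left unimplemented because it depends on knowing what each parameterized test expands to, which is still under investigation.

```python
from dataclasses import dataclass
from typing import List

OPTIONS = ("default", "ignore-all", "collect-all-single", "collect-all-multiple")


@dataclass(frozen=True)
class ClassInfo:
    """Hypothetical record for a test class found in the test APK."""
    name: str
    parameterized: bool


def plan_shards(tests: List[ClassInfo], option: str = "default",
                shard_count: int = 2) -> List[List[str]]:
    """Return shards as lists of class names for the given option value."""
    if option not in OPTIONS:
        raise ValueError(f"unknown parameterized-tests option: {option}")
    if option == "collect-all-multiple":
        # Needs per-case expansion of parameterized tests; not sketched here.
        raise NotImplementedError("per-case sharding requires runtime expansion")

    regular = [t.name for t in tests if not t.parameterized]
    parameterized = [t.name for t in tests if t.parameterized]

    # default keeps everything in one pool; the other options shard only regular tests.
    pool = [t.name for t in tests] if option == "default" else regular
    shards = [pool[i::shard_count] for i in range(shard_count)]
    shards = [s for s in shards if s]  # drop empty shards

    if option == "collect-all-single" and parameterized:
        shards.append(parameterized)  # one dedicated shard for all of them
    return shards


suite = [
    ClassInfo("A", False),
    ClassInfo("P1", True),
    ClassInfo("B", False),
    ClassInfo("C", False),
    ClassInfo("P2", True),
]
for opt in ("default", "ignore-all", "collect-all-single"):
    print(opt, plan_shards(suite, opt))
```

Putting the parameterized classes in an extra shard, rather than in one of the existing ones, is a design choice made for this sketch so the regular shards keep their balance. The actual option could reasonably behave differently.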

Investigation and addition of more timing reporting to flank requires modification and additions to the work done in #1666.

Dependencies

There are no changes

Testing

The solution should be unit tested and integration tested accordingly as it adds new features to flank.

Conclusion / Recommendation

The recommended course of action is to tackle the tasks as follows:

Add information/notification of the Parameterized tests : Parameterized Tests: Add Notification & Documentation #2022
Add support for new options ignore-all, collect-all-single : Parameterized Tests: Add new options #1 #2023
Investigate and add support if possible for collect-all-multiple : Parameterized Tests: Add new options #2 #2024
Add timing support : Parameterized Tests: Add more timings #2025

valeraz added the Feature label Mar 22, 2021

zuziaka added Research SDD Tiger 🐯 labels May 24, 2021

Sloox self-assigned this Jun 2, 2021

This was referenced Jun 10, 2021

Parameterized Tests: Add Notification & Documentation #2022

Closed

Parameterized Tests: Add new options #1 #2023

Closed

Parameterized Tests: Add new options #2 #2024

Closed

Parameterized Tests: Add more timings #2025

Closed

Sloox added the Epic label Jun 17, 2021

Sloox mentioned this issue Jun 17, 2021

feat: Added new Parameterized Test Option #2035

Merged

2 tasks

Sloox closed this as completed Jul 20, 2021
