Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve sharding of parameterized tests #1720

Closed
valeraz opened this issue Mar 22, 2021 · 3 comments
Closed

Improve sharding of parameterized tests #1720

valeraz opened this issue Mar 22, 2021 · 3 comments

Comments

@valeraz
Copy link
Contributor

valeraz commented Mar 22, 2021

As a flank setup maintainer at Slack, I would like flank to improve its handing of parameterized tests (on Android), so that we can avoid unexpected increases in test suite execution time.

This came up recently because, in our weekly metrics meeting, we noticed a significant increase in the execution time of the CI job that runs instrumentation tests. Upon, closer examination, we noticed that the PR when it started included a new Parameterized test. Flank currently treats parameterized tests as any other test methods, so it got included in a shard as if it were 1 test class with multiple test methods. At runtime, this expanded to many more tests, causing this one shard to execute for much longer (i.e. we had an imbalance in sharding). Ideally, flank would "know" how many test a parameterized test expands to and accouts for that in its sharding.

Describe the solution
@bootstraponline has suggested: "some intelligent preprocessing of the AST and figure out what the parameterized tests will expand to."

Describe alternatives considered

  • Perhaps an easier one would be to place each parameterized test class in its own shard? (as a config option)
  • I also wonder whether we should consider having an option to prevent parameterized tests from being added to a test execution: in theory these types of tests (targeting many permutations) should live in jvm-land. In our particular case, the test class was easy to move to a jvm test.
@asadsalman
Copy link
Contributor

asadsalman commented Apr 6, 2021

I have also ran into something similar. One of our parameterized test class took 10 minutes more than any other shard. While I understand the limitations of sharding parameterized classes, it took a long time to figure out what was happening at test level. Neither FTL nor Flank show class-level durations, so it wasn't until I wrote a small JUnit XML parser did I realize that one class is taking 10mins+.

While a longer term fix to parameterized classes may be harder to implement, printing some warning about parameterized classes taking a long time (compared to other shards etc.) would help narrow things down.

Thoughts @bootstraponline @jan-gogo?

@bootstraponline
Copy link
Contributor

bootstraponline commented Apr 7, 2021

I agree some type of warning or reporting would be good. Maybe adding a report that includes test time by class, and note the parameterized classes?

In the long term I think we'll likely move to runtime discovery of tests in the new corellium backend. That should enable us to get a complete picture of what tests exist. Parsing out the info from bytecode has limitations.

The output of this task should be a few proof of concepts of different approaches and a doc which summarizes the tradeoffs.

@Sloox
Copy link
Contributor

Sloox commented Jun 10, 2021

Improve Sharding of Parameterized tests

Flank users want flank to make use of parameterized tests in the correct manner and provide more information about the use of parameterized tests.

References

Motivation

Based on the discussion found in 1720. Flank needs to improve its handling of parameterized tests as making use of them can significantly increase test execution time.

Goals

Improving support for parameterized tests by investigating/implementing the following:

  1. Provide information and warning of use for parameterized tests on flank.
  2. Add options for parameterized tests that help with their usage. It can include ignoring, place in seperate shard and possibly smart flank parameterization.
    • intelligent pre-processing of the AST & figuring out how to handle parameterized tests should be investigated seperatly due to possible complexity
  3. Investigate and add more types of reporting to flank so it can help narrow down issues such that parameterized tests created eg, test time by class or by test.

Finally a recommendation as to what path to take regarding the above goals.

Design

Flank already partially supports Parameterized tests and as per discussion it adds all the iterations for the test into one single shard, which can dramatically increase the time for that particular shard.

  1. Therefore there should be a information/warning test notifying the user of the possible conseqeunce as such.
  2. Add options for parameterized tests that will filter them to either:
    • ignore-all - Ignore parameterized tests and do not add them
    • collect-all-single - Collect all the parameterized tests and run in a single shard
    • collect-all-multiple - Smart collection all the parameterized test and run seperatly on a per case basis
  3. Investigate and see if its possible to add more types of time reporting to flank that include duration tracking at class-level, test-level

Implementation

Implementation will be broken up into the viability of whether it can be easily implemented or not.

  1. Adding information and or warning about test notifying the user about Parameterized tests is a simple task and most likely can be achieved via println within CreateAndroidTestContext

    • Documents should be updated in flank to highlight the issue associated with Parameterized tests
  2. Adding any new option will require modification to:

    • AndroidArgs
      Alongside other changes that will add the new option\s for use
      A suggested name and description
### Parameterized Tests
  ## Specifies how to handle tests which contain Parameterization.
  ## 4 options are available
  ## default: treat Parameterized tests as normal and shard accordingly
  ## ignore-all: Parameterized tests are ignored and not sharded
  ## collect-all-single: Parameterized tests are collected and put into a single shard
  ## collect-all-multiple: Smart collection of all the parameterized test and run seperatly on a per case basis
  ## Note: if left blank default is used.
  # parameterized-tests: default

The addition of the options will be split into another ticket for investigation as it is a fairly complex task.

  1. Investigation and addition of more timing reporting to flank requires modification and additions to the work done in 1666

Dependencies

There are no changes

Testing

The solution should be unit tested and integration tested accordingly as it adds new features to flank.

Conclustion/ Recommendation

The recommended course of action is to tackle the tasks as follows:

  1. Add information/notification of the Parameterized tests : Parameterized Tests: Add Notification & Documentation  #2022
  2. Add support for new options ignore-all, collect-all-single : Parameterized Tests: Add new options #1 #2023
  3. Investigate and add support if possible for collect-all-multiple : Parameterized Tests: Add new options #2 #2024
  4. Add timing support : Parameterized Tests: Add more timings #2025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants