Poor unit test performance using py_test rule #311

Open

cwaeland opened this issue May 14, 2020 · 12 comments

@cwaeland

I'm in the process of switching our rather large Python codebase over to Bazel and have just finished getting our entire unit test suite running under it. Our tests are written using unittest, and we use pytest as our test runner. Typically a full run of the test suite (~6000 tests) would take around 10 minutes. However, with Bazel I'm now observing run times of about 40 minutes. Of course, once things are cached it is super fast.
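
For reference, each py_test target invokes pytest through a small wrapper script, roughly like the minimal sketch below (file and target names are illustrative, not our actual layout):

```python
# pytest_runner.py -- minimal sketch of a pytest shim used as a py_test main
# (illustrative; our real wrapper passes additional pytest options).
import sys

import pytest

if __name__ == "__main__":
    # Forward any arguments Bazel passes and point pytest at this target's
    # test module.
    sys.exit(pytest.main(sys.argv[1:] + ["listing_check_ins_test.py"]))
```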

I knew to expect some overhead but I haven't seen anything like this with other Bazel projects I've worked on.

I went ahead and collected some timing/profile information for subsets of our unit tests using Bazel's analyze-profile option. bazel clean was run before each timing run.

First, our web package: ./bazel test tests/python/web/.... This consists of 70 test targets. Profile information:

=== PHASE SUMMARY INFORMATION ===

Total launch phase time         6.702 s    2.35%
Total init phase time           0.267 s    0.09%
Total loading phase time        1.632 s    0.57%
Total analysis phase time       5.197 s    1.82%
Total preparation phase time    0.032 s    0.01%
Total execution phase time    271.452 s   95.15%
Total finish phase time         0.017 s    0.01%
------------------------------------------------
Total run time                285.301 s  100.00%

Critical path (165.526 s):
       Time Percentage   Description
     392 ms    0.24%   action 'Creating source manifest for //tests/python/web/controllers/api listing_check_ins_test'
   16.616 s   10.04%   action 'Creating runfiles tree bazel-out/darwin-py2-fastbuild/bin/tests/python/web/controllers/api/listing_check_ins_test.runfiles'
    12.5 ms    0.01%   runfiles for //tests/python/web/controllers/api listing_check_ins_test
  148.505 s   89.72%   action 'Testing //tests/python/web/controllers/api listing_check_ins_test'

Since listing_check_ins_test (which is genuinely a slow test) seems to dominate the critical path, I tried running it on its own:

=== PHASE SUMMARY INFORMATION ===

Total launch phase time         0.060 s    0.12%
Total init phase time           0.139 s    0.28%
Total loading phase time        1.616 s    3.28%
Total analysis phase time       4.551 s    9.25%
Total preparation phase time    0.017 s    0.04%
Total execution phase time     42.820 s   87.00%
Total finish phase time         0.016 s    0.03%
------------------------------------------------
Total run time                 49.221 s  100.00%

Critical path (42.567 s):
       Time Percentage   Description
     154 ms    0.36%   action 'Creating source manifest for //tests/python/web/controllers/api listing_check_ins_test'
     920 ms    2.16%   action 'Creating runfiles tree bazel-out/darwin-py2-fastbuild/bin/tests/python/web/controllers/api/listing_check_ins_test.runfiles'
    11.6 ms    0.03%   runfiles for //tests/python/web/controllers/api listing_check_ins_test
   41.481 s   97.45%   action 'Testing //tests/python/web/controllers/api listing_check_ins_test'

It seems noteworthy that when run by itself it's 3x faster.

At this point I wanted to compare to pytest.

When testing the web package using pytest we see the following result: 216 passed in 96.19 seconds

When running the single slow test with pytest: 11 passed in 32.79 seconds.

Interestingly when running the executable that Bazel produces directly (./listing_check_ins_test) I observe a similar time to that of pytest.

Ultimately Bazel profiling seems to indicate:

  • That the majority of time is spent in the execution phase.
  • Running tests in parallel seems to degrade per-test speed significantly.
  • The test executable that Bazel produces performs about the same as pytest when run on its own.

Any advice on how to continue to investigate this further would be appreciated.

@cwaeland cwaeland changed the title Poor unit test performance using py_test Poor unit test performance using py_test rule May 14, 2020
@thundergolfer
Collaborator

Hey @cwaeland, thanks for the detail you've put in here. Unfortunately, I don't have any immediate advice. I've experienced slowness in py_test relative to using pytest or nose outside of Bazel, but caching has kept developer efficiency from decreasing on our (smaller) Python test suite.

@pvcnt

pvcnt commented Aug 28, 2020

I am using Bazel for a fairly small Python-only codebase, and I see long build times (not only tests; just building is slow). Right now it takes about 10 minutes to do a full build on CI, while there are only about 10-15 Python packages. For comparison, I can do a full build of another repository containing only Java code in about 4 minutes (that codebase is slightly larger). Again, I am only counting build time, not test time.

I don't have precise figures, but my impression is that it is related to the number and size of external dependencies. In particular, at some point, building the runfiles tree was taking forever. I would be very interested in better understanding the performance bottlenecks in Python…

@github-actions

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!

@github-actions github-actions bot added the Can Close? Will close in 30 days if there is no new activity label Apr 14, 2021
@thundergolfer thundergolfer removed the Can Close? Will close in 30 days if there is no new activity label Apr 28, 2021
@thundergolfer thundergolfer self-assigned this Apr 28, 2021
@tjulinfan

Hi there, we are using Python 3.8 with Bazel, and we have had the same experience: py_test is really slow compared with a native Python run, at least 3x slower. Any ideas or suggestions?

@ramilmsh

@thundergolfer any updates?

@RJPercival

As mentioned in the OP, this is probably a result of running tests in parallel. Some test suites, e.g. for Django, will create a database during setup. When you only invoke pytest once for the entire test suite, that cost is only paid once. If, like Bazel, you invoke pytest once per test target, then all of that setup, like creating a database, is multiplied by the number of targets. Could that explain the slowdown you're seeing?
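
For example (a hypothetical sketch, not your actual code), a suite whose setup looks like the following pays that cost once per pytest invocation; under Bazel that means once per py_test target rather than once for the whole run:

```python
import time
import unittest


class ListingCheckInsTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Stand-in for expensive suite-level setup, such as creating a test
        # database. With one pytest invocation per Bazel target, this runs
        # once per target instead of once for the entire suite.
        time.sleep(5)
        cls.ready = True

    def test_check_in_listed(self):
        self.assertTrue(self.ready)
```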

@ramilmsh

ramilmsh commented Oct 6, 2023

@RJPercival not in my case, because I run all the tests in a service as one bazel test target (something I haven't gotten around to fixing yet), and all the services are started in my CI pipeline before the build/test stage entirely (something I don't even know how to fix, because I haven't figured out how to run integration tests in Bazel).

@aignas
Collaborator

aignas commented Jan 22, 2024

I wonder if #1653 is related? Shall we close this issue in favour of the newer one?

@ramilmsh

@aignas I am not using a hermetic toolchain and I appear to experience the issue anyway. I can benchmark to know for sure, if that helps.

@cwaeland
Author

@aignas the poor performance occurs even without using a hermetic toolchain. However, a hermetic toolchain would worsen it.

@rickeylev
Contributor

I think the thing to do is rule out "lots of files in runfiles" as the culprit (or otherwise isolate that from performance analysis). It's easy to end up needing thousands of files after adding a few third party dependencies, which basically re-creates the situation with the hermetic runtimes -- lots of files need to be materialized in the runfiles.

There are a couple of known issues in Bazel: (1) this is a lot of duplicated work (the runtime and external deps are the same files for each target), and (2) Macs in particular are hit harder by the cost of creating thousands of runfiles, for some Mac-specific reason.
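
One quick, illustrative way to check (hypothetical script, not part of rules_python) is to build one of the slow targets and count how many files end up in its runfiles tree:

```python
import os
import sys

# Usage: python count_runfiles.py <path-to-runfiles-dir>
# e.g. bazel-bin/tests/python/web/controllers/api/listing_check_ins_test.runfiles
# after building that target.
runfiles_dir = sys.argv[1]
total = sum(len(files) for _, _, files in os.walk(runfiles_dir))
print(f"{total} files under {runfiles_dir}")
```

If that number is in the thousands, runfiles materialization is a plausible contributor to the slowdown.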

@rickeylev
Contributor

FYI: I was told that the 7.2 release of Bazel contained some performance improvements to sandboxing, which should help with the overhead of creating files. There's also a flag that might make things faster: --experimental_inmemory_sandbox_stashes
