test_gen.py extremely slow on GitHub Actions #946

Closed
pont-us opened this issue Mar 12, 2024 · 3 comments · Fixed by #980

@pont-us
Member

pont-us commented Mar 12, 2024

Describe the bug

GitHub Actions unit test runs are very slow (usually around 40 minutes, versus ~12 minutes per platform on AppVeyor and ~5 minutes on my local machine). Closer inspection of the logs reveals that around 35 of these 40 minutes are spent in one test module -- test/core/gen/test_gen.py. For instance, look at the logs for this test run:

Mon, 11 Mar 2024 17:19:43 GMT test/core/byoa/test_fileset.py ...........                               [  9%]
Mon, 11 Mar 2024 17:19:43 GMT test/core/gen/test_config.py ......                                      [  9%]
Mon, 11 Mar 2024 17:54:01 GMT test/core/gen/test_gen.py ...............                                [ 10%]
Mon, 11 Mar 2024 17:54:01 GMT test/core/gen/test_iproc.py .......                                      [ 11%]
Mon, 11 Mar 2024 17:54:07 GMT test/core/gen2/local/test_generator.py ......                            [ 11%]
Mon, 11 Mar 2024 17:54:07 GMT test/core/gen2/local/test_helpers.py ..                                  [ 11%]

To Reproduce
Steps to reproduce the behavior:

  1. Run the xcube ‘Unittest and docker builds’ workflow in GitHub Actions, or look at one of the previous runs.
  2. Display the logs for the unittest step and activate timestamps.
  3. Observe that test_gen.py takes over half an hour.
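
The per-module time in step 3 can also be read off the timestamped log automatically. The following is a minimal sketch, not part of xcube: the function name `module_durations` and the regular expression are illustrative, and it assumes that each log line is stamped when the module's line is complete, so that the gap to the previous line is the time spent in that module.

```python
import re
from email.utils import parsedate_to_datetime

# Matches lines such as
# "Mon, 11 Mar 2024 17:54:01 GMT test/core/gen/test_gen.py ....  [ 10%]"
LOG_LINE = re.compile(
    r"^(?P<stamp>\w{3}, \d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2} GMT)"
    r"\s+(?P<module>\S+\.py)\s"
)


def module_durations(log_text):
    """Return (module, seconds) pairs, one per log line after the first.

    Each module is charged with the gap between its own timestamp and
    the timestamp of the preceding matching line (an assumption about
    how the log lines are stamped).
    """
    previous = None
    durations = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.lstrip())
        if match is None:
            continue  # not a timestamped pytest module line
        stamp = parsedate_to_datetime(match["stamp"])
        if previous is not None:
            seconds = (stamp - previous).total_seconds()
            durations.append((match["module"], seconds))
        previous = stamp
    return durations


# Log excerpt copied from this issue (dots shortened).
SAMPLE_LOG = """\
Mon, 11 Mar 2024 17:19:43 GMT test/core/byoa/test_fileset.py ... [  9%]
Mon, 11 Mar 2024 17:19:43 GMT test/core/gen/test_config.py ... [  9%]
Mon, 11 Mar 2024 17:54:01 GMT test/core/gen/test_gen.py ... [ 10%]
Mon, 11 Mar 2024 17:54:01 GMT test/core/gen/test_iproc.py ... [ 11%]
Mon, 11 Mar 2024 17:54:07 GMT test/core/gen2/local/test_generator.py ... [ 11%]
Mon, 11 Mar 2024 17:54:07 GMT test/core/gen2/local/test_helpers.py ... [ 11%]
"""

if __name__ == "__main__":
    slowest = max(module_durations(SAMPLE_LOG), key=lambda item: item[1])
    print(slowest)  # ('test/core/gen/test_gen.py', 2058.0)
```

For the excerpt above this attributes 2058 seconds (34 minutes 18 seconds) to test_gen.py.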

Expected behavior
test_gen.py on GHA completes in seconds or tens of seconds (as it already does on other platforms), rather than in tens of minutes.

pont-us self-assigned this Mar 12, 2024

@konstntokas
Contributor

konstntokas commented May 15, 2024

I profiled the module test/core/gen/test_gen.py and compared the results from the CI pipeline with those from my local machine.

GitHub pipeline profiling:
Screenshot from 2024-05-15 08-29-53

Local profiling:
Screenshot from 2024-05-15 08-51-18
Screenshot from 2024-05-15 08-51-43

What I have noticed so far:

  • The function _compute_ij_images_for_source_line() in xcube/core/resampling/rectify.py consumes a lot of time in the CI pipeline, but does not appear among the first 1000 entries of the profiling result on the local machine. Numba is used to parallelize its for loop, so I assume that something goes wrong with numba in the CI pipeline.
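
A comparable profile can be collected with Python's standard cProfile and pstats modules. This is only a minimal sketch: the thread does not say which profiler produced the screenshots, and `profile_top` and `slow_loop` are hypothetical names used for illustration, with `slow_loop` standing in for a pure-Python hot loop.

```python
import cProfile
import io
import pstats


def profile_top(func, *args, limit=10, **kwargs):
    """Run func under cProfile.

    Returns the function's result and a text report listing the
    `limit` most expensive entries, sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return result, buffer.getvalue()


def slow_loop(n):
    # Hypothetical stand-in for an uncompiled, pure-Python loop.
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    _, report = profile_top(slow_loop, 100_000)
    print(report)
```

Running the same script in both environments and comparing the top entries shows which functions dominate in one place but not in the other.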

@konstntokas
Contributor

As the next step, I removed numba from xcube/core/resampling/rectify.py.

GitHub pipeline profiling:
Screenshot from 2024-05-15 11-09-40

Local profiling:
Screenshot from 2024-05-15 11-10-05
Screenshot from 2024-05-15 11-10-22

What I noticed:

  • The CI pipeline did not get any slower.
  • The local test run got a lot slower and now spends a lot of time in _compute_ij_images_for_source_line().

Conclusion
Numba speeds up the local run but has no effect in the CI pipeline, so something goes wrong with the numba parallelization there. I will investigate this further.

@konstntokas
Contributor

Cause
NUMBA_DISABLE_JIT is set to 1 in xcube_workflow.yaml#L17, which disables numba.jit.

Solution

  • set NUMBA_DISABLE_JIT: 0, or
  • remove env: NUMBA_DISABLE_JIT: 1
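
To notice this kind of misconfiguration earlier, a test session could check the variable before running. The sketch below is an assumption-laden illustration, not xcube code: `numba_jit_disabled` is a hypothetical helper, and it assumes that numba reads NUMBA_DISABLE_JIT as an integer with a default of 0 and ignores values it cannot parse.

```python
import os


def numba_jit_disabled(environ=None):
    """Return True if NUMBA_DISABLE_JIT is set to a non-zero integer.

    `environ` defaults to os.environ; passing a dict makes the check
    easy to test.
    """
    env = os.environ if environ is None else environ
    raw = env.get("NUMBA_DISABLE_JIT", "0").strip()
    try:
        return int(raw) != 0
    except ValueError:
        # Assumption: an unparseable value falls back to the
        # default, i.e. JIT compilation stays enabled.
        return False


if __name__ == "__main__":
    if numba_jit_disabled():
        print("NUMBA_DISABLE_JIT is set: numba-decorated code runs uncompiled")
    else:
        print("numba JIT compilation is not disabled via the environment")
```

Called from a conftest.py or at the start of a CI step, such a check makes it visible in the log that the numba-decorated loops will run as plain Python.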
