`@batch` slows down other non-`@batch`ed loops with allocations on macOS ARM

Some of my simulations are regularly stopping for about a second when using `@batch` on macOS ARM.
I could reduce this problem to this minimal example, but I am now clueless how to continue.

```julia
using Polyester


function with_batch()
    # Just some loop with @batch with basically no runtime
    @batch for i in 1:2
        nothing
    end

    # This is just to make sure that the allocation in the next loop is not optimized away
    v = [[]]

    # Note that there is no @batch here
    for i in 1:1000
        # Just an allocation
        v[1] = []
    end
end

function without_batch()
    for i in 1:2
        nothing
    end

    v = [[]]

    for i in 1:1000
        v[1] = []
    end
end
```
Benchmarking yields the following:
```
julia> @benchmark with_batch()
BenchmarkTools.Trial: 8709 samples with 1 evaluation.
 Range (min … max):   16.416 μs …   1.404 s  ┊ GC (min … max): 0.00% … 0.47%
 Time  (median):      18.041 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   663.460 μs ± 30.068 ms  ┊ GC (mean ± σ):  0.41% ± 0.01%

         ▁▁▄▇█▇▅▃▁▁ ▁▂▂▃▄▄▄▂▃▃▃▃▃▃▄▄▃▄▃▂▁      ▁               ▂
  ▂▂▃▅▅▇███████████████████████████████████████████████████▇▆▆ █
  16.4 μs       Histogram: log(frequency) by time      23.6 μs <

 Memory estimate: 46.98 KiB, allocs estimate: 1002.

julia> @benchmark without_batch()
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  14.625 μs …   5.596 ms  ┊ GC (min … max):  0.00% … 99.31%
 Time  (median):     15.166 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   18.275 μs ± 110.414 μs  ┊ GC (mean ± σ):  12.03% ±  1.98%

     █▇▃                                                        
  ▂▂▅███▆▃▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▅▃▂▂▃▂▂▂▂▂▂▃▅▆▄▃▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  14.6 μs         Histogram: frequency by time         19.8 μs <

 Memory estimate: 46.98 KiB, allocs estimate: 1002.
```
About one execution out of 2000 takes over one second, which causes the mean to be 30x higher than without any `@batch` loops. This is consistent with what I see in simulations, where most time steps are fast, but then some take over a second.

This problem is specific to macOS ARM. The same Julia version on an x86 machine works as expected.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

`@batch` slows down other non-`@batch`ed loops with allocations on macOS ARM #89

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

@batch slows down other non-@batched loops with allocations on macOS ARM #89

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

`@batch` slows down other non-`@batch`ed loops with allocations on macOS ARM #89