Revert "partr: fix multiqueue resorting stability" #35589

JeffBezanson · 2020-04-24T17:48:42Z

This reverts commit c23554a.

This caused pfib to run ~forever.

This reverts commit c23554a.

vtjnash · 2020-04-25T15:56:20Z

Hm, any idea why? I know I split out that commit so it could be bisected (and reverted) easily, but it'd be good to understand

JeffBezanson · 2020-04-25T18:20:45Z

All the tasks have the same priority, so I think this was causing it to iterate through most of them on each operation.

JeffBezanson · 2020-04-27T20:46:04Z

@kpamnany What is the right thing to do here to dequeue tasks in the right order without being O(n^2)?

kpamnany · 2020-04-28T14:48:05Z

@JeffBezanson: What's the 'right' order? Originally partr used <= in both sift_up() and sift_down(). So in multiq_insert(), the most recently added task was moved to the top of the heap and after a multiq_deletemin(), the task from the bottom of the heap moved back down to the bottom. This honors the depth first strategy.

Looks like this has been changed to < in both... but not sure why.

JeffBezanson · 2020-04-28T15:05:57Z

Looks like this has been changed to < in both... but not sure why.

Because using <= causes the code to traverse a large part of the structure on each operation, making the scheduler O(n^2) and then we spend all of run time there.

kpamnany · 2020-04-28T16:19:40Z

Have you confirmed that with profiling? If not, I should be able to finally get started on doing some of that this week.

It shouldn't really happen that sifting consumes a lot of runtime. There are 2 * jl_n_threads heaps that are 8-ary trees laid out breadth-first, i.e. 8 task movements in a single cache line. If you have a workload that creates a lot of tasks which yield repeatedly, then perhaps it could get slow -- that's a bad profile for this approach.

If you increase heap_c to 4 here you'll get some speedup for these workloads, but there'll be some overhead when there aren't many tasks.

JeffBezanson · 2020-04-28T16:45:16Z

Yes. Try running this:

function pfib(n::Int)
    if n <= 1
        return n
    end
    t = Threads.@spawn pfib(n-2)
    return pfib(n-1) + fetch(t)::Int
end
pfib(25)

with 2 threads.
With <:

  0.176900 seconds (729.64 k allocations: 87.096 MiB, 12.04% gc time)

with <=:

  6.887915 seconds (728.51 k allocations: 87.061 MiB, 0.49% gc time)

I admit this is a pathological case, but I don't think it's ok for the default scheduler to have any worst case this bad.

kpamnany · 2020-04-28T17:47:15Z

Yes. There's a 7.5X slowdown with fib in the C partr as well. And we aren't even using the priorities right now AFAICT. This needs some thought.

* origin/master: (833 commits) Improve typesubtract for tuples (#35600) Make searchsorted*/findnext/findprev return values of keytype (#32978) fix buggy rand(RandomDevice(), Bool) (#35590) remove `Ref` allocation on task switch (#35606) Revert "partr: fix multiqueue resorting stability" (#35589) exclude types with free variables from `type_morespecific` (#35555) fix small typo in NEWS.md (#35611) enable inline allocation of structs with pointers (#34126) SparseArrays: Speed up right-division by diagonal matrices (#35533) Allow hashing 1D OffsetArrays NEWS item for introspection macros (#35594) Special case empty covec-diagonal-vec product (#35557) [GCChecker] fix a few tests by looking through casts Use norm instead of abs in generic lu factorization (#34575) [GCChecker,NFC] run clang-format -style=llvm [GCChecker] fix tests and add Makefile Add introspection macros support for dot syntax (#35522) Specialize `union` for `OneTo` (#35577) add pop!(vector, idx, [default]) (#35513) bump Pkg version (#35584) ...

Revert "partr: fix multiqueue resorting stability"

bf53304

This reverts commit c23554a.

JeffBezanson added the domain:multithreading Base.Threads and related functionality label Apr 24, 2020

JeffBezanson mentioned this pull request Apr 26, 2020

remove Ref allocation on task switch #35606

Merged

JeffBezanson merged commit 47087cb into master Apr 27, 2020

JeffBezanson deleted the jb/siftdown branch April 27, 2020 20:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Revert "partr: fix multiqueue resorting stability" #35589

Revert "partr: fix multiqueue resorting stability" #35589

JeffBezanson commented Apr 24, 2020

vtjnash commented Apr 25, 2020

JeffBezanson commented Apr 25, 2020

JeffBezanson commented Apr 27, 2020

kpamnany commented Apr 28, 2020

JeffBezanson commented Apr 28, 2020

kpamnany commented Apr 28, 2020

JeffBezanson commented Apr 28, 2020

kpamnany commented Apr 28, 2020

Revert "partr: fix multiqueue resorting stability" #35589

Revert "partr: fix multiqueue resorting stability" #35589

Conversation

JeffBezanson commented Apr 24, 2020

vtjnash commented Apr 25, 2020

JeffBezanson commented Apr 25, 2020

JeffBezanson commented Apr 27, 2020

kpamnany commented Apr 28, 2020

JeffBezanson commented Apr 28, 2020

kpamnany commented Apr 28, 2020

JeffBezanson commented Apr 28, 2020

kpamnany commented Apr 28, 2020