To support destructible and sinkable types, in particular atomic refcounted types, tasks must zero-init their data buffer.
This was introduced in #144 to properly support the refcounted FlowEvent.
However, it adds a significant 17% overhead on very short-running tasks, as measured on Fibonacci(40).
Note: "significant" is relative. Fibonacci(40) spawns an exponential number of tasks (2^40 is an upper bound; the naive recursion makes on the order of hundreds of millions of calls), and each task does less work than the zero-initialization itself.
     proc newTaskFromCache*(): Task =
    -  result = workerContext.taskCache.pop()
    +  result = workerContext.taskCache.pop0()
       if result.isNil:
    -    result = myMemPool().borrow(deref(Task))
    +    result = myMemPool().borrow0(deref(Task))
    -  # Zeroing is expensive, it's 96 bytes
    -  # result.fn = nil # Always overwritten
    -  # result.parent = nil # Always overwritten
    -  # result.scopedBarrier = nil # Always overwritten
    -  result.prev = nil
    -  result.next = nil
    -  result.start = 0
    -  result.cur = 0
    -  result.stop = 0
    -  result.stride = 0
    -  result.futures = nil
    -  result.isLoop = false
    -  result.hasFuture = false
    +  # The task must be fully zero-ed including the data buffer
    +  # otherwise datatypes that use custom destructors
    +  # and that rely on "myPointer.isNil" to return early
    +  # may read recycled garbage data.
    +  # "FlowEvent" is such an example
    +  # TODO: The perf cost to the following is 17% as measured on fib(40)
    +  # # Zeroing is expensive, it's 96 bytes
    +  # # result.fn = nil # Always overwritten
    +  # # result.parent = nil # Always overwritten
    +  # # result.scopedBarrier = nil # Always overwritten
    +  # result.prev = nil
    +  # result.next = nil
    +  # result.start = 0
    +  # result.cur = 0
    +  # result.stop = 0
    +  # result.stride = 0
    +  # result.futures = nil
    +  # result.isLoop = false
    +  # result.hasFuture = false
The simple optimization would be to zero-init only the part of the buffer that will be overwritten.
An alternative would be to zero-init the buffer only for non-trivial types, as detected by `supportsCopyMem`.
A third possibility would be to do both.
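
As a rough illustration of the third option (both optimizations combined), here is a minimal sketch. `DemoTask`, `TaskDataSize`, `bytesToZero` and `prepareData` are hypothetical names for this example and are not part of Weave; only `supportsCopyMem` (from `std/typetraits`) and `zeroMem` are existing Nim procs. The sketch assumes the stored value starts at offset 0 of the data buffer.

```nim
import std/typetraits

const TaskDataSize = 144 # hypothetical buffer size, not Weave's actual value

type
  DemoTask = object
    ## Hypothetical stand-in for Weave's Task; only the data buffer is modeled.
    data: array[TaskDataSize, byte]

proc bytesToZero(T: typedesc): int =
  ## Number of leading buffer bytes to clear before a T is stored.
  when supportsCopyMem(T):
    # Trivial type: no destructor or sink will inspect the old bytes.
    result = 0
  else:
    # Non-trivial type: clear only the region the value will occupy.
    result = sizeof(T)

proc prepareData(task: var DemoTask, T: typedesc) =
  ## Zero only what is needed for T instead of the whole buffer.
  static: doAssert sizeof(T) <= TaskDataSize
  let n = bytesToZero(T)
  if n > 0:
    zeroMem(task.data.addr, n)
```

With this shape, a task whose arguments are plain integers (as in the Fibonacci benchmark) would skip zeroing entirely, while a task capturing a refcounted value would clear only `sizeof(T)` bytes. Whether this recovers the measured 17% would need to be benchmarked.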
The change: https://github.com/mratsim/weave/pull/144/files#diff-c5d52e34ee454756d2c729faec306b62L113