ABA in local queue #5041

Closed
sbarral opened this issue Sep 21, 2022 · 2 comments · Fixed by #5042
Labels: A-tokio (Area: The main tokio crate), M-runtime (Module: tokio/runtime)

Comments

sbarral (Contributor) commented Sep 21, 2022

When I first noticed that the local tokio queue is prone to ABA, I thought that this was the prime reason for the use of u16 indices. This mitigation seems very weak though, so I thought I would file an issue in case this was in fact overlooked.

Say that a stealer thread is pre-empted in the code section of Steal::steal_into2 shown below, after the tail is loaded but before the CAS is performed, intending to steal half of the N tasks initially available. Say that in the meantime the worker thread pops exactly 2^16 tasks while pushing fewer than 2^16 - N/2 tasks. Since the head has wrapped around back to its original value, the CAS wrongly succeeds and the stealer steals more items than are actually in the queue, hence UB.

```rust
let src_tail = self.0.tail.load(Acquire);

// If these two do not match, another thread is concurrently
// stealing from the queue.
if src_head_steal != src_head_real {
    return 0;
}

// Number of available tasks to steal
let n = src_tail.wrapping_sub(src_head_real);
let n = n - n / 2;

if n == 0 {
    // No tasks available to steal
    return 0;
}

// Update the real head index to acquire the tasks.
let steal_to = src_head_real.wrapping_add(n);
assert_ne!(src_head_steal, steal_to);
next_packed = pack(src_head_steal, steal_to);

// Claim all those tasks. This is done by incrementing the "real"
// head but not the steal. By doing this, no other thread is able to
// steal from this queue until the current thread completes.
let res = self
    .0
    .head
    .compare_exchange(prev_packed, next_packed, AcqRel, Acquire);
```

I am not sure this is a purely theoretical concern. The time to push and pop 2^16 tasks is less than 1 ms on my oldish laptop, so a thread suspension of that length is probably not entirely outside the realm of the possible in the worst case. Admittedly this does not take into account the time needed to actually process the tasks, but it still looks concerning to me.
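To make the wraparound concrete, here is a small, self-contained sketch (plain Rust, not tokio code; all values are hypothetical) that only models the net index movement of the scenario above with u16 indices:

```rust
fn main() {
    // Snapshot taken by the stealer just before it is preempted.
    let src_head_real: u16 = 40_000;
    let src_tail: u16 = 40_100; // N = 100 tasks appear available

    // While the stealer sleeps, the worker pops exactly 2^16 tasks and pushes
    // 65_440 new ones (interleaved, since the buffer is bounded); this
    // satisfies 2^16 - N <= pushes < 2^16 - N/2.
    let pops: u32 = 1 << 16;
    let pushes: u32 = 65_440;
    let new_head = src_head_real.wrapping_add(pops as u16); // wraps back to 40_000
    let new_tail = src_tail.wrapping_add(pushes as u16);    // 40_004

    // The head bits are unchanged, so the stealer's CAS on the packed head
    // still succeeds and it claims half of the *stale* availability...
    let stale_avail = src_tail.wrapping_sub(src_head_real); // 100
    let claimed = stale_avail - stale_avail / 2;            // 50

    // ...even though the queue now holds only 4 tasks.
    let actually_available = new_tail.wrapping_sub(new_head);

    assert_eq!(new_head, src_head_real);
    assert_eq!(claimed, 50);
    assert_eq!(actually_available, 4);
}
```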

On 64-bit platforms a very good ABA mitigation would come at no cost by using u32 in place of u16 and u64 in place of u32. According to my benchmarks, this may even be very slightly faster. On 32-bit platforms, what to do is less obvious: keep the status quo and accept the risk, or use u32 and u64 when AtomicU64 exists and accept a possible degradation in performance.
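For illustration, a minimal sketch of the widened layout on 64-bit platforms, assuming it mirrors the current scheme (steal index in the low half of the packed word, real head in the high half); the helper names here are made up for the example, not taken from tokio:

```rust
const SHIFT: u32 = 32;

// Pack the (steal, real) head indices into one u64 word.
fn pack(steal: u32, real: u32) -> u64 {
    (steal as u64) | ((real as u64) << SHIFT)
}

// Recover the two u32 indices from the packed word.
fn unpack(packed: u64) -> (u32, u32) {
    (packed as u32, (packed >> SHIFT) as u32)
}

fn main() {
    let packed = pack(3, 7);
    assert_eq!(unpack(packed), (3, 7));
    // With u32 indices, the head only returns to the same value after exactly
    // 2^32 pops during a single preemption window, instead of 2^16.
}
```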

Darksonn added the A-tokio (Area: The main tokio crate) and M-runtime (Module: tokio/runtime) labels on Sep 21, 2022
carllerche (Member)

Thanks for the report. I think the risk is pretty low, but we can bump it to u32 on 64-bit machines. You want to submit a PR?

sbarral (Contributor, Author) commented Sep 21, 2022

Will do!

Darksonn pushed a commit that referenced this issue Sep 27, 2022
When 64-bit atomics are supported, use 32-bit queue indices. This
greatly improves resilience to ABA and has no impact on performance on
64-bit platforms.

Fixes: #5041
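As a rough sketch of the gating described in this commit message: the target_has_atomic cfg predicate is standard Rust, but the module and alias names below are placeholders rather than tokio's actual internals.

```rust
// Use 32-bit queue indices when 64-bit atomics are available,
// and fall back to the previous u16/u32 scheme otherwise.
#[cfg(target_has_atomic = "64")]
mod idx {
    pub type UnsignedShort = u32;
    pub type UnsignedLong = u64;
    pub type AtomicUnsignedLong = std::sync::atomic::AtomicU64;
}

#[cfg(not(target_has_atomic = "64"))]
mod idx {
    pub type UnsignedShort = u16;
    pub type UnsignedLong = u32;
    pub type AtomicUnsignedLong = std::sync::atomic::AtomicU32;
}

fn main() {
    // The packed (steal, real) head occupies one UnsignedLong.
    let initial: idx::UnsignedLong = 0;
    let head = idx::AtomicUnsignedLong::new(initial);
    let _ = head.load(std::sync::atomic::Ordering::Acquire);
    println!(
        "queue index width: {} bits",
        std::mem::size_of::<idx::UnsignedShort>() * 8
    );
}
```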