ABA in local queue #5041

Closed
sbarral opened this issue Sep 21, 2022 · 2 comments · Fixed by #5042
Labels: A-tokio (Area: The main tokio crate), M-runtime (Module: tokio/runtime)

Comments

sbarral (Contributor) commented Sep 21, 2022

When I first noticed that the local tokio queue is prone to ABA, I thought that this was the prime reason for the use of u16 indices. This mitigation seems very weak though, so I thought I would file an issue in case this was in fact overlooked.

Say that a stealer thread is pre-empted in the code section of Steal::steal_into2 shown below, after the tail is loaded but before the CAS is performed, intending to steal half of the N tasks initially available. Say that in the meantime the worker thread pops exactly 2^16 tasks while pushing fewer than 2^16 - N/2 tasks. Since the head has wrapped around back to its original value, the CAS wrongly succeeds and the stealer steals more items than are actually in the queue, hence UB.

```rust
let src_tail = self.0.tail.load(Acquire);

// If these two do not match, another thread is concurrently
// stealing from the queue.
if src_head_steal != src_head_real {
    return 0;
}

// Number of available tasks to steal
let n = src_tail.wrapping_sub(src_head_real);
let n = n - n / 2;

if n == 0 {
    // No tasks available to steal
    return 0;
}

// Update the real head index to acquire the tasks.
let steal_to = src_head_real.wrapping_add(n);
assert_ne!(src_head_steal, steal_to);
next_packed = pack(src_head_steal, steal_to);

// Claim all those tasks. This is done by incrementing the "real"
// head but not the steal. By doing this, no other thread is able to
// steal from this queue until the current thread completes.
let res = self
    .0
    .head
    .compare_exchange(prev_packed, next_packed, AcqRel, Acquire);
```

I am not sure this is a purely theoretical concern. The time to push and pop 2^16 tasks is less than 1 ms on my oldish laptop, so a thread suspension of that length is probably not entirely outside the realm of the possible in the worst case. Admittedly this does not take into account the time needed to actually process the tasks, but it still looks concerning to me.
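To make the wraparound concrete, here is a small, self-contained sketch (plain Rust, not tokio code; all values are hypothetical) that only models the net index movement of the scenario above with u16 indices:

```rust
fn main() {
    // Snapshot taken by the stealer just before it is preempted.
    let src_head_real: u16 = 40_000;
    let src_tail: u16 = 40_100; // N = 100 tasks appear available

    // While the stealer sleeps, the worker pops exactly 2^16 tasks and pushes
    // 65_440 new ones (interleaved, since the buffer is bounded); this
    // satisfies 2^16 - N <= pushes < 2^16 - N/2.
    let pops: u32 = 1 << 16;
    let pushes: u32 = 65_440;
    let new_head = src_head_real.wrapping_add(pops as u16); // wraps back to 40_000
    let new_tail = src_tail.wrapping_add(pushes as u16);    // 40_004

    // The head bits are unchanged, so the stealer's CAS on the packed head
    // still succeeds and it claims half of the *stale* availability...
    let stale_avail = src_tail.wrapping_sub(src_head_real); // 100
    let claimed = stale_avail - stale_avail / 2;            // 50

    // ...even though the queue now holds only 4 tasks.
    let actually_available = new_tail.wrapping_sub(new_head);

    assert_eq!(new_head, src_head_real);
    assert_eq!(claimed, 50);
    assert_eq!(actually_available, 4);
}
```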

On 64-bit platforms a very good ABA mitigation would come at no cost by using u32 in place of u16 and u64 in place of u32. According to my benchmarks, this may even be very slightly faster. On 32-bit platforms, what to do is less obvious: keep the status quo and accept the risk, or use u32 and u64 when AtomicU64 exists and accept a possible degradation in performance.
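For illustration, a minimal sketch of the widened layout on 64-bit platforms, assuming it mirrors the current scheme (steal index in the low half of the packed word, real head in the high half); the helper names here are made up for the example, not taken from tokio:

```rust
const SHIFT: u32 = 32;

// Pack the (steal, real) head indices into one u64 word.
fn pack(steal: u32, real: u32) -> u64 {
    (steal as u64) | ((real as u64) << SHIFT)
}

// Recover the two u32 indices from the packed word.
fn unpack(packed: u64) -> (u32, u32) {
    (packed as u32, (packed >> SHIFT) as u32)
}

fn main() {
    let packed = pack(3, 7);
    assert_eq!(unpack(packed), (3, 7));
    // With u32 indices, the head only returns to the same value after exactly
    // 2^32 pops during a single preemption window, instead of 2^16.
}
```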

Darksonn added the A-tokio (Area: The main tokio crate) and M-runtime (Module: tokio/runtime) labels on Sep 21, 2022
carllerche (Member)

Thanks for the report. I think the risk is pretty low, but we can bump it to u32 on 64-bit machines. You want to submit a PR?

sbarral (Contributor, Author) commented Sep 21, 2022

Will do!

Darksonn pushed a commit that referenced this issue Sep 27, 2022
When 64-bit atomics are supported, use 32-bit queue indices. This
greatly improves resilience to ABA and has no impact on performance on
64-bit platforms.

Fixes: #5041
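As a rough sketch of the gating described in this commit message: the target_has_atomic cfg predicate is standard Rust, but the module and alias names below are placeholders rather than tokio's actual internals.

```rust
// Use 32-bit queue indices when 64-bit atomics are available,
// and fall back to the previous u16/u32 scheme otherwise.
#[cfg(target_has_atomic = "64")]
mod idx {
    pub type UnsignedShort = u32;
    pub type UnsignedLong = u64;
    pub type AtomicUnsignedLong = std::sync::atomic::AtomicU64;
}

#[cfg(not(target_has_atomic = "64"))]
mod idx {
    pub type UnsignedShort = u16;
    pub type UnsignedLong = u32;
    pub type AtomicUnsignedLong = std::sync::atomic::AtomicU32;
}

fn main() {
    // The packed (steal, real) head occupies one UnsignedLong.
    let initial: idx::UnsignedLong = 0;
    let head = idx::AtomicUnsignedLong::new(initial);
    let _ = head.load(std::sync::atomic::Ordering::Acquire);
    println!(
        "queue index width: {} bits",
        std::mem::size_of::<idx::UnsignedShort>() * 8
    );
}
```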