We have encountered a deadlock when attempting to nest caches, as in a key-key-value multi-dimensional map, where many spawned tasks interact with the cache at once and each task calls wait after inserting the inner value.
All of the tokio (non-blocking) worker threads become blocked trying to obtain a lock on the cache. The await inside wait allows another task that is destined to block to be scheduled, eventually causing deadlock. In the code below, inner_cache.wait() happens while a read lock on lru is held; avoiding the inner_cache.wait() call prevents the deadlock.
We have since restructured our code so it no longer uses nested caches and avoids this situation, but I wanted to share this discovery. This may not be a usage pattern that is intended to work or be supported, but to my eyes it is not immediately obvious that it would deadlock. I speculate that if the CacheProcessor were spawned on a dedicated thread instead of a green thread it might be able to complete the wait group and unstick things (we did not experience a deadlock with this same design pattern when using the sync Cache).
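For reference, here is a rough sketch of the kind of flattening we ended up with. It assumes the same AsyncCache API used in the test below and that the builder accepts a tuple key; it is illustrative only, not our exact code:

```rust
// Illustrative only, not our exact restructuring: flatten the two key dimensions
// into one composite (outer, inner) key so there is no inner cache, and therefore
// no wait() call running while an outer cache entry (and its read lock) is held.
async fn flat_composite_key_example() {
    let max_cost = 10_000;
    let cost = 1;
    let lru = AsyncCacheBuilder::new(max_cost * 10, max_cost as i64)
        .set_ignore_internal_cost(true)
        .finalize(tokio::spawn)
        .expect("failed to create cache");

    let outer_key = 1;
    for inner_key in 1..=10_000 {
        // One flat cache keyed by (outer, inner); no nested cache to wait on.
        lru.insert((outer_key, inner_key), 123, cost).await;
    }
    lru.wait().await.unwrap();
    assert!(lru.get(&(outer_key, 1)).is_some());
}
```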
Here is an example test that can demonstrate the deadlock:
```rust
#[tokio::test(flavor = "multi_thread")]
async fn test_wait_with_read_lock() {
    let max_cost = 10_000;
    let lru = Arc::new(
        AsyncCacheBuilder::new(max_cost * 10, max_cost as i64)
            .set_ignore_internal_cost(true)
            .finalize(tokio::spawn)
            .expect("failed to create cache"),
    );
    let key = 1;
    let cost = 1;
    let mut tasks = Vec::new();
    for i in 1..=10_000 {
        let lru = Arc::clone(&lru);
        tasks.push(tokio::spawn(async move {
            // Get the inner cache for `key`, creating it on first use.
            let inner_cache = match lru.get(&key) {
                Some(v) => v,
                None => {
                    let inner_lru = AsyncCacheBuilder::new(max_cost * 10, max_cost as i64)
                        .set_ignore_internal_cost(true)
                        .finalize(tokio::spawn)
                        .expect("failed to create cache");
                    lru.insert(key, inner_lru, cost).await;
                    lru.wait().await.unwrap();
                    lru.get(&key).unwrap()
                }
            };
            // `inner_cache` is a value reference that holds a read lock on `lru`.
            let inner_cache = inner_cache.value();
            inner_cache.insert(i, 123, cost).await;
            eprintln!("i = {i}, len before wait = {}", inner_cache.len());
            // removing this wait avoids deadlock
            inner_cache.wait().await.unwrap();
        }));
    }
    for task in tasks {
        task.await.unwrap();
    }
}
```
Hi @jrray, I think this is caused by the AsyncWaitGroup implementation in https://github.com/al8n/wg. I suspect we need to replace parking_lot::Mutex and parking_lot::RwLock with async locks for the AsyncCache.
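For context, here is a minimal sketch of the hazard being described and the direction of that fix. It uses tokio::sync::Mutex purely as an illustration (hypothetical code, not the cache's actual internals): a blocking lock guard held across an .await parks the task while other tasks block entire worker threads waiting for the same lock, whereas an async lock makes waiters yield back to the runtime.

```rust
// Illustrative sketch only (hypothetical example, not the cache's actual internals).
// If this used a blocking lock (std / parking_lot) and the guard were held across
// the .await, every worker thread could end up blocked in lock() while the task
// that owns the guard sits parked at the await point, never polled again to
// release it. tokio's async Mutex makes contending tasks yield to the runtime
// instead of blocking the OS thread, so progress continues.
use std::sync::Arc;
use tokio::sync::Mutex;

#[tokio::main(flavor = "multi_thread", worker_threads = 2)]
async fn main() {
    let shared = Arc::new(Mutex::new(0u64));
    let mut tasks = Vec::new();
    for _ in 0..1_000 {
        let shared = Arc::clone(&shared);
        tasks.push(tokio::spawn(async move {
            // Async lock: a contended acquisition suspends the task, freeing the thread.
            let mut guard = shared.lock().await;
            // Await point while the guard is held: fine with an async lock,
            // deadlock-prone with a blocking one under enough concurrency.
            tokio::task::yield_now().await;
            *guard += 1;
        }));
    }
    for t in tasks {
        t.await.unwrap();
    }
}
```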