Description
note: despite the similarity in title to #415, i think this use case is sufficiently different to warrant a separate issue. #415 is caused by an inability to distinguish a re-acquired lock's timestamp when the lock is lost through expiration. i believe the issue described below would still be present even if that were fixed, due to sharing a single lock instance.
finally had time to put together a writeup for this - it is possible to incorrectly release a lock that has a timeout when sharing a `Lock` instance within a single process (this also affects `LuaLock` from #411).
- create a lock instance and save it somewhere shared between all threads/coroutines
- thread0 `acquire()` succeeds
- thread1 `acquire()` blocks
- lock times out, key expires
- thread1 `acquire()` succeeds
- thread0 releases thread1's lock
since both contexts use the same instance, when thread0 calls `release()`, the token or timestamp matches the value in redis and the lock is released. this could happen when an operation takes far longer than expected - say the database thread0 was talking to dies and it has to wait for a TCP connection to time out.
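the failure mode can be demonstrated without a live redis server. this is a minimal simulation, with a dict standing in for redis and a hypothetical `SharedLock` class standing in for redis-py's `Lock` (the per-acquire token stored on the instance is the essential detail it copies):

```python
import time

# dict standing in for redis: key -> (value, expiry timestamp)
store = {}

class SharedLock:
    """simplified stand-in for redis-py's Lock: one token slot per *instance*."""
    def __init__(self, name, timeout):
        self.name = name
        self.timeout = timeout
        self.token = None

    def acquire(self):
        now = time.monotonic()
        entry = store.get(self.name)
        if entry is not None and entry[1] > now:
            return False  # an unexpired lock is held
        self.token = object()  # per-acquire token, shared via the instance
        store[self.name] = (self.token, now + self.timeout)
        return True

    def release(self):
        entry = store.get(self.name)
        # compare-and-delete: only release if the stored token matches ours
        if entry is not None and entry[0] is self.token:
            del store[self.name]
            return True
        return False

lock = SharedLock("resource", timeout=0.01)

assert lock.acquire()   # thread0 acquires
time.sleep(0.02)        # thread0 stalls past the timeout; the key "expires"
assert lock.acquire()   # thread1 acquires the expired lock...
                        # ...and overwrites the token on the shared instance
assert lock.release()   # thread0's release "succeeds" - thread1's lock is gone
```

because `acquire()` in thread1 overwrote the instance's token, thread0's compare-and-delete matches and wrongly removes thread1's lock.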
i know dogpile.cache shares lock instances in this way; there are likely other libraries/programs affected by this.
the first question is whether this is a bug - it only happens if one of the callers loses the lock because it ran over the time limit set by the programmer (who knows what this value should be to cover all cases - hah). calling code could handle this case by making a new instance for each context that wishes to access the lock (edit: this would work for `LuaLock`; #415 shows there's an additional problem in `Lock`).
if there is a desire to handle this in redis-py, i propose adding a parameter defaulting to `None` to `StrictRedis.lock()` and both locks' `__init__()`, which receives a function that generates a string to prepend to the value stored in redis. this string is defined by the caller to distinguish between contexts in which the lock is accessed (the thread id, for example). the prefix string must be generated each time it is needed, so when `release()` is called by thread0 in the above example, the value it hands the lua script carries thread0's prefix while redis holds thread1's, and the release fails. this has the benefit of being customizable for calling code while having a minimal effect on programs that don't need this functionality.
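a sketch of how the proposed parameter might behave, again with a dict standing in for redis. the `token_prefix` parameter and `_full_value()` helper are hypothetical names for illustration, not existing redis-py API; the point is that the prefix callable is re-invoked on every acquire/release, so calls from different threads produce different stored values:

```python
import threading

store = {}  # dict standing in for redis

class PrefixedLock:
    def __init__(self, name, token_prefix=None):
        self.name = name
        self.token = "token"  # fixed token, for illustration only
        # token_prefix: caller-supplied callable identifying the context
        self.token_prefix = token_prefix or (lambda: "")

    def _full_value(self):
        # regenerated on every call, so acquire() in one thread and
        # release() in another produce different values
        return self.token_prefix() + ":" + self.token

    def acquire(self):
        store[self.name] = self._full_value()
        return True

    def release(self):
        # compare-and-delete on the *prefixed* value
        if store.get(self.name) == self._full_value():
            del store[self.name]
            return True
        return False

# use the thread id as the context-distinguishing prefix
lock = PrefixedLock("resource", token_prefix=lambda: str(threading.get_ident()))
lock.acquire()

released = []
t = threading.Thread(target=lambda: released.append(lock.release()))
t.start(); t.join()

assert released == [False]     # another thread's release is rejected
assert lock.release() is True  # the acquiring thread can still release
```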
in general, i think it would aid debugging if an attempt to release a lock owned by another client became an error condition - the current `Lock` silently ignores this case and the `LuaLock` implementation from #411 carries over that behavior.
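as a sketch of what that error condition could look like, once more against a dict stand-in - the `LockError` class here is defined locally for self-containment (redis-py's own `redis.exceptions.LockError` could serve the same role):

```python
class LockError(Exception):
    """raised when a client tries to release a lock it does not own."""

store = {}  # dict standing in for redis

def release(name, expected_value):
    # instead of silently returning when the stored value doesn't match,
    # surface the mismatch to the caller
    current = store.get(name)
    if current != expected_value:
        raise LockError("cannot release a lock owned by another client")
    del store[name]

store["resource"] = "thread1-token"

try:
    release("resource", "thread0-token")
except LockError:
    pass  # thread0's stale release is now loud instead of silent

release("resource", "thread1-token")  # the real owner still releases fine
assert "resource" not in store
```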
thoughts?