Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug in Lock's release() function #415

Closed
JerryKwan opened this issue Dec 18, 2013 · 4 comments
Closed

bug in Lock's release() function #415

JerryKwan opened this issue Dec 18, 2013 · 4 comments

Comments

@JerryKwan
Copy link

there is a bug in Lock's release() function.
below is a test script.

import redis
import time

r = redis.StrictRedis(host = '192.168.56.1', port = 6379, db = 0)

lock = r.lock('hash:lock', 5)
print 'first lock acquire: ', lock.acquire()
print 'current time: ', time.time()
print 'lock value: ', r.get('hash:lock')
# time.sleep(21)
second_lock = r.lock('hash:lock', 10)
print 'second_lock acquire: ', second_lock.acquire(True)
print 'current time: ', time.time()
print 'lock value: ', r.get('hash:lock')
# first lock release
print 'release first lock'
lock.release()
print 'lock value: ', r.get('hash:lock')
third_lock = r.lock('hash:lock', 10)
print 'current time: ', time.time()
print 'third_lock acquire: ', third_lock.acquire(False)
print 'lock value: ', r.get('hash:lock')

and the result is:
first lock acquire: True
current time: 1387347026.74
lock value: 1387347031.729
second_lock acquire: True
current time: 1387347031.75
lock value: 1387347041.741
release first lock
lock value: 1387347041.741
current time: 1387347031.75
third_lock acquire: True
lock value: 1387347041.741

as the result shows, the third lock got the lock, but at the time the second lock is not released.
why?
look at the release() function, there is a check if should delete the key:

def release(self):
        "Releases the already acquired lock"
        if self.acquired_until is None:
            raise ValueError("Cannot release an unlocked lock")
        existing = float(self.redis.get(self.name) or 1)
        # if the lock time is in the future, delete the lock
        delete_lock = existing >= self.acquired_until
        self.acquired_until = None
        if delete_lock:
            self.redis.delete(self.name)

as the test script describes: first lock acquire , then second acquire, then first acquire release, but we should not delete the key because the lock is acquired by second lock, the ower is second lock not first lock, so at that time the first lock's release method should do nothing.
may be the logic in release function is wrong, i think we should change it as follows:

def release(self):
        "Releases the already acquired lock"
        if self.acquired_until is None:
            raise ValueError("Cannot release an unlocked lock")
        existing = float(self.redis.get(self.name) or 1)
        # if the lock time is out dated, delete the lock
        delete_lock = existing <= self.acquired_until
        self.acquired_until = None
        if delete_lock:
            self.redis.delete(self.name)
@chillipino
Copy link
Contributor

i haven't tried your proposed code change, but it looks like it skips deleting the lock if the timeout hasn't passed yet, so the timeout on Lock would actually be a minimum lifetime. it's only possible for release() to delete the key after the timeout has expired. it also always deletes the key if the key doesn't exist (because of the or 1 in the float() call).

that said, i do agree there's a bug with locks that have timeouts. as currently coded, once the timeout has expired on a lock instance, that lock instance is in an inconsistent state and shouldn't be used to release the lock. no attempt is made to detect this case in release(), so the behavior you pointed out is possible. i tried your test script using the LuaLock implementation from #411, and it works as expected. maybe it would be good to consider moving to a random-string-token-style implementation for Lock as well?

TL;DR: i think storing just a timestamp in redis is insufficient to determine whether a lock was lost and acquired by another process when using timeouts.

i just opened #416, which covers a similar (but not identical) case.

edit: decreased snark

@JerryKwan
Copy link
Author

@chillipino
in the modified release() function, when a lock is not timeout but release() is invoked, then existing = self.acquired_until, so the delete_lock is True, and then the key is deleted.
test script follows:

import redis
import time

r = redis.StrictRedis(host = '192.168.56.1', port = 6379, db = 0)

lock = r.lock('hash:lock', 5)
print 'first lock acquire: ', lock.acquire()
print 'current time: ', time.time()
print 'lock value: ', r.get('hash:lock')
print 'release lock'
print 'current time: ', time.time()
lock.release()
print 'lock value: ', r.get('hash:lock')

and the test result shows like this:
first lock acquire: True
current time: 1387419425.39
lock value: 1387419430.38
release lock
current time: 1387419425.39
lock value: None

as the test result shows, when a lock is released before timeout, the key is deleted as designed.

@chillipino
Copy link
Contributor

ah, i missed the == case, sorry about that - i do have a hard time believing that code that has been in use for years could be fixed by flipping a comparison. :) it does still attempt to delete the key if it does not exist, though. actually, if another machine happens to acquire the lock between the get() and delete() in the case where get() returns None, the delete() will release the other machine's lock. release() probably needs to be a transaction to be really safe.

i still think the best thing to do for locking is to store a value in the server that doesn't depend on clock synchronization across all clients. dealing with floating point serialization, parsing, and comparison makes this worse, especially when the clients involved may be running on different platforms and architectures. there's even an example above - print time.time() is printing 2 digits of precision in the output from the test script. when i run that statement, i get 6-7 digits of precision in both python 2 and 3 (the same is true when using repr()), so my clock will still appear to be ahead of @JerryKwan's if they're perfectly synced. the second problem mentioned in #384 is another example.

@andymccurdy
Copy link
Contributor

The new lock system is out in 2.10. Thanks for your input.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants