ThreadError: Attempt to unlock a mutex which is locked by another thread/fiber #501
On Ruby 3, here is an example of how to produce the issue:

def mutex_failure
  m = Mutex.new

  f1 = Fiber.new do
    m.lock
  end

  f2 = Fiber.new do
    m.unlock
  end

  f1.resume
  f2.resume
end

mutex_failure

# ./test.rb:39:in `unlock': Attempt to unlock a mutex which is locked by another thread/fiber (ThreadError)
#   from ./test.rb:39:in `block in mutex_failure'
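For contrast, a minimal sketch (not from the original report, assuming Ruby >= 3.0, where a Mutex is owned by the Fiber that locked it) of why only the cross-fiber unlock is the problem; on Ruby 2.7 the same unlock would have succeeded because ownership was tracked per Thread:

m = Mutex.new

same_fiber = Fiber.new do
  m.lock
  m.unlock   # fine: locked and unlocked by the same fiber
end
same_fiber.resume

m.lock                                # locked by the main (root) fiber
other_fiber = Fiber.new { m.unlock }  # unlocked from a different fiber of the same thread
other_fiber.resume                    # => ThreadError: Attempt to unlock a mutex which is locked by another thread/fiber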
Okay, with those basic semantics in mind, I managed to figure out the sequence of events that causes this to fail:

#!/usr/bin/env ruby

require 'rspec/support/reentrant_mutex'

def rspec_failure
  m = RSpec::Support::ReentrantMutex.new

  f1 = Fiber.new do
    m.synchronize do
      puts "f1 A"
      Fiber.yield
      puts "f1 B"
    end
  end

  f2 = Fiber.new do
    m.synchronize do
      puts "f2 A"
      # Fiber.yield
      f1.resume
      puts "f2 B"
    end
  end

  f1.resume
  f2.resume
  f2.resume
end

rspec_failure

Running rspec_failure gives the same ThreadError: f1's synchronize locks the underlying Mutex and yields; f2's synchronize re-enters without locking (the owner check uses Thread.current and both fibers share the thread); f2 then resumes f1 to completion, and when f2's block finishes, exit sees the count drop to zero and calls unlock from fiber f2, even though the Mutex was locked by fiber f1.
cc @eregon I believe we discussed that there should be no use case where a mutex is locked from one fiber and released on another, but maybe this is a valid use case? Does this mean we should relax the condition, or can you see some other solution?
#!/usr/bin/env ruby

require 'monitor'

def monitor_failure
  m = Monitor.new

  f1 = Fiber.new do
    m.synchronize do
      puts "f1 A"
      Fiber.yield
      puts "f1 B"
    end
  end

  f2 = Fiber.new do
    m.synchronize do
      puts "f2 A"
      # Fiber.yield
      f1.resume
      puts "f2 B"
    end
  end

  f1.resume
  f2.resume
  f2.resume
end

monitor_failure

Since the same thing happens with the standard library's Monitor, maybe this is not an RSpec-specific bug; we should consider moving it to bugs.ruby-lang.org.
The example code here does have an intrinsic problem: a mutex/monitor section is supposed to be atomic, but here the atomicity is broken by switching fibers (specifically, f1 A happens without f1 B; a data structure could become corrupted if those prints were real mutations of shared state).
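To make that concrete, here is a hedged sketch (ThreadOwnedLock is a hypothetical stand-in for the pre-fix ReentrantMutex, which tracks its owner per Thread): because both fibers run on the same thread, the second synchronize re-enters while the first critical section is suspended, and the first fiber's stale write then silently discards the other update.

require 'fiber'

# Hypothetical stand-in for the pre-fix ReentrantMutex: the owner is tracked per Thread.
class ThreadOwnedLock
  def initialize
    @mutex = Mutex.new
    @owner = nil
    @count = 0
  end

  def synchronize
    @mutex.lock if @owner != Thread.current
    @owner = Thread.current
    @count += 1
    yield
  ensure
    @count -= 1
    if @count == 0
      @owner = nil
      @mutex.unlock
    end
  end
end

balance = 100
lock = ThreadOwnedLock.new

withdraw = Fiber.new do
  lock.synchronize do
    old = balance
    Fiber.yield          # the "atomic" section is suspended here
    balance = old - 10   # stale read: overwrites the deposit below
  end
end

deposit = Fiber.new do
  lock.synchronize do    # re-enters: same Thread, so the owner check passes
    balance += 50
  end
end

withdraw.resume
deposit.resume
withdraw.resume
puts balance             # prints 90 rather than 140: the deposit was lost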
After more thought, this example does not acquire the lock in a reentrant way (= on the same execution stack; fibers have different stacks), so it should behave exactly the same as Mutex, and the difference is a bug of Monitor (and ReentrantMutex). So it should raise a deadlock error, or if there is a Fiber scheduler active, schedule another Fiber.
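For contrast, a small sketch of the reentrant case that is supposed to work: re-acquiring on the same execution stack (the same fiber) is fine with Monitor.

require 'monitor'

m = Monitor.new
m.synchronize do
  m.synchronize do   # re-entry on the same stack: supported, does not block
    puts "nested synchronize on the same stack works"
  end
end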
@ioquatix Could you create an issue on bugs.ruby-lang.org?
The solution here is to change the owner tracking in ReentrantMutex from per-Thread to per-Fiber on Ruby 3.
Okay, so here is the initial problem. I tried adding the following code as a hack:

require 'rspec/support/reentrant_mutex'
require 'fiber'

module RSpec
  module Support
    class ReentrantMutex
      def enter
        @mutex.lock if @owner != Fiber.current
        @owner = Fiber.current
        @count += 1
      end
    end
  end
end

Now I get other issues I need to figure out.
Okay, I actually think using
By the way, on Ruby 2.7.2 I'm getting an error as well. And I don't see that this situation changes if we replace Thread.current with Fiber.current. To illustrate this:

require 'fiber'
class ReentrantMutex
  def initialize
    @owner = nil
    @count = 0
    @mutex = Mutex.new
  end

  def synchronize
    enter
    yield
  ensure
    exit
  end

  private

  def enter
    @mutex.lock if @owner != Fiber.current
    @owner = Fiber.current
    @count += 1
  end

  def exit
    @count -= 1
    return unless @count == 0
    @owner = nil
    @mutex.unlock
  end
end

m = ReentrantMutex.new

f1 = Fiber.new do
  m.synchronize do
    puts "f1 A"
    Fiber.yield
    puts "f1 B"
  end
  puts "f1 finish"
end

f2 = Fiber.new do
  m.synchronize do
    puts "f2 A"
    # Fiber.yield
    f1.resume
    puts "f2 B"
  end
  puts "f2 finish"
end

f1.resume
f2.resume

With Fiber.current as the owner, f2's enter now tries to lock a Mutex that is already held by f1's fiber, so the example still fails, just at the lock rather than at the unlock.
The piece you are missing is that with a fiber scheduler, locking a mutex on one fiber will allow other fibers to be scheduled.
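For illustration, a hedged sketch using the async gem (not something this issue depends on), which installs a fiber scheduler for the duration of the block: when the second task blocks on the Mutex, the scheduler suspends that fiber and runs others, then resumes it once the lock is released, instead of raising.

require 'async'

mutex = Mutex.new

Async do |task|
  task.async do
    mutex.synchronize do
      sleep 0.1                  # non-blocking under the scheduler
      puts "first task releasing the lock"
    end
  end

  task.async do
    mutex.synchronize do         # suspends only this fiber until the lock is free
      puts "second task acquired the lock"
    end
  end
end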
Good point. I struggle to understand from the rdoc what the default behaviour of the default fiber scheduler is: is it non-blocking or blocking?
There is no default fiber scheduler; trying to lock a mutex, transferring to a different fiber and then trying to take the lock again is an error without a scheduler in place.
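A minimal sketch of that error case (assuming Ruby 3.0 with no scheduler set; the exact message may differ):

m = Mutex.new

holder = Fiber.new do
  m.lock
  Fiber.yield
  m.unlock
end

blocked = Fiber.new do
  m.lock     # held by `holder`; with no scheduler this thread cannot make progress
  m.unlock
end

holder.resume
blocked.resume # raises a ThreadError (deadlock) rather than hanging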
Also, speaking of re-entrance, it's puzzling why we nullify @owner:

m.synchronize do
  m.synchronize { ... } # ok, re-entry of the same fiber
  Fiber.yield # other fiber locks the mutex. if the scheduler and the fiber are non-blocking, we proceed to the next line
  m.synchronize { ... } # BOOM, we lock again, because the `@owner` is the other fiber now
end
We don't The sequence is:
Abstracting out from
The relevant lines are rspec-support/lib/rspec/support/reentrant_mutex.rb, lines 38 to 39 (commit d63133f).
Your example is probably just a variant of @ioquatix's; #502 (comment) is the proper fix.
…ent Thread or Fiber * In Ruby >= 3, Mutex are held per Fiber, not per Thread. * Fixes rspec#501
Just to follow up with everyone here, https://bugs.ruby-lang.org/issues/17827 was backported and should be in the next 3.0.x release.
…y the current Thread or Fiber * In Ruby >= 3, Mutex are held per Fiber, not per Thread. * Fixes rspec/rspec-support#501 --- This commit was imported from rspec/rspec-support@0d64895.
…y the current Thread or Fiber * In Ruby >= 3, Mutex are held per Fiber, not per Thread. * Fixes rspec/rspec-support#501 --- This commit was imported from rspec/rspec-support@a4be9d5.
Thanks, I can take a look.
Subject of the issue
Your environment
Steps to reproduce
This is a tricky issue to reproduce.
Essentially it looks like some of the assumptions made by reentrant_mutex.rb are no longer true in Ruby 3.0. I will review the code and try to give an update, but essentially it looks like trying to lock a ReentrantMutex from a different fiber on the same thread might be a problem.