Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix merging race conditions #570

Merged
merged 1 commit into from
Jul 17, 2017
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 13 additions & 8 deletions lib/simplecov.rb
Original file line number Diff line number Diff line change
Expand Up @@ -75,23 +75,21 @@ def add_not_loaded_files(result)
# from cache using SimpleCov::ResultMerger if use_merging is activated (default)
#
def result
# Ensure the variable is defined to avoid ruby warnings
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was ostensibly added for rubocop, but rubocop and ruby don't complain about it now, so ¯_(ツ)_/¯

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably was solved some time when result? was introduced/use or so

@result = nil unless defined?(@result)
return @result if result?

# Collect our coverage result
if running && !result?
if running
@result = SimpleCov::Result.new add_not_loaded_files(Coverage.result)
end

# If we're using merging of results, store the current result
# first, then merge the results and return those
# first (if there is one), then merge the results and return those
if use_merging
SimpleCov::ResultMerger.store_result(@result) if result?

SimpleCov::ResultMerger.merged_result
else
@result
@result = SimpleCov::ResultMerger.merged_result
end

@result
ensure
self.running = false
end
Expand Down Expand Up @@ -158,6 +156,13 @@ def usable?
false
end
end

#
# Clear out the previously cached .result. Primarily useful in testing
#
def clear_result
@result = nil
end
end
end

Expand Down
55 changes: 42 additions & 13 deletions lib/simplecov/result_merger.rb
Original file line number Diff line number Diff line change
Expand Up @@ -17,25 +17,31 @@ def resultset_writelock
File.join(SimpleCov.coverage_path, ".resultset.json.lock")
end

# Loads the cached resultset from JSON and returns it as a Hash
# Loads the cached resultset from JSON and returns it as a Hash,
# caching it for subsequent accesses.
def resultset
if stored_data
begin
JSON.parse(stored_data)
rescue
@resultset ||= begin
data = stored_data
if data
begin
JSON.parse(data) || {}
rescue
{}
end
else
{}
end
else
{}
end
end

# Returns the contents of the resultset cache as a string or if the file is missing or empty nil
def stored_data
return unless File.exist?(resultset_path)
data = File.read(resultset_path)
return if data.nil? || data.length < 2
data
synchronize_resultset do
return unless File.exist?(resultset_path)
data = File.read(resultset_path)
return if data.nil? || data.length < 2
data
end
end

# Gets the resultset hash and re-creates all included instances
Expand Down Expand Up @@ -76,8 +82,9 @@ def merged_result

# Saves the given SimpleCov::Result in the resultset cache
def store_result(result)
File.open(resultset_writelock, "w+") do |f|
f.flock(File::LOCK_EX)
synchronize_resultset do
# Ensure we have the latest, in case it was already cached
clear_resultset
new_set = resultset
command_name, data = result.to_hash.first
new_set[command_name] = data
Expand All @@ -87,6 +94,28 @@ def store_result(result)
end
true
end

# Ensure only one process is reading or writing the resultset at any
# given time
def synchronize_resultset
# make it reentrant
return yield if defined?(@resultset_locked) && @resultset_locked
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need the defined? check? It should default to nil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, but it issues a ruby warning instance variable @resultset_locked not initialized, which fails the build, thanks to spec/support/fail_rspec_on_ruby_warning.rb

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also still not sure what making it reentrant here means and accomplishes, but reentrancy also has been a long time ago for me. The way I see it, the block we are given is now also called when the result set is already locked. Doesn't this effectively circumvent the lock?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so the idea is that if we call synchronize_resultset when we are already inside a synchronize_resultset call, we just let it through, since we know we already have a lock from the outer call. So we're not circumventing the lock, so much as saying "we already have a lock higher in the call stack, so just run the block". Due to how flock works, we would get a deadlock if we didn't, since the inner flock would wait forever for the outer one to release.

The reason we need to do this is we want to make both reads and writes acquire a lock so that neither one does anything unsafe. And in order to safely do a write, we 1. first need to read and 2. we need to ensure nobody else writes between our read and our write, otherwise we might lose data. So to accomplish that, write does synchronize_resultset around the whole thing so that any other processes have to wait. And since its nested read will also do synchronize_resultset, we use @resultset_locked to let it know it doesn't actually need a lock since we already got one.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which works as we are just one process, got it, thanks for taking the time to explain! :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of that said, do you happen to know of a way to test this? We could probably extract synchronize_resultset and yse that to write some specs around it which might be worthwhile, not sure. I just see code that I have questions about and often feel better, the better that code is tested :D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I have a couple ideas, I'll tweak the PR with some more tests


begin
@resultset_locked = true
File.open(resultset_writelock, "w+") do |f|
f.flock(File::LOCK_EX)
yield
end
ensure
@resultset_locked = false
end
end

# Clear out the previously cached .resultset
def clear_resultset
@resultset = nil
end
end
end
end
93 changes: 88 additions & 5 deletions spec/result_merger_spec.rb
Original file line number Diff line number Diff line change
@@ -1,10 +1,16 @@
require "helper"
require "tempfile"
require "timeout"

if SimpleCov.usable?
describe SimpleCov::ResultMerger do
before do
SimpleCov::ResultMerger.clear_resultset
File.delete(SimpleCov::ResultMerger.resultset_path) if File.exist?(SimpleCov::ResultMerger.resultset_path)
end

describe "with two faked coverage resultsets" do
before do
SimpleCov.use_merging true
@resultset1 = {
source_fixture("sample.rb") => [nil, 1, 1, 1, nil, nil, 1, 1, nil, nil],
source_fixture("app/models/user.rb") => [nil, 1, 1, 1, nil, nil, 1, 0, nil, nil],
Expand Down Expand Up @@ -73,15 +79,92 @@
expect(SimpleCov::ResultMerger.results.length).to eq(1)
end
end
end
end
end

describe ".store_result" do
it "refreshes the resultset" do
set = SimpleCov::ResultMerger.resultset
SimpleCov::ResultMerger.store_result({})
new_set = SimpleCov::ResultMerger.resultset
expect(new_set).not_to be(set)
end

context "with merging disabled" do
before { SimpleCov.use_merging false }
it "persists to disk" do
SimpleCov::ResultMerger.store_result("a" => [1])
SimpleCov::ResultMerger.clear_resultset
new_set = SimpleCov::ResultMerger.resultset
expect(new_set).to eq("a" => [1])
end

it "synchronizes writes" do
expect(SimpleCov::ResultMerger).to receive(:synchronize_resultset)
SimpleCov::ResultMerger.store_result({})
end
end

it "returns nil for SimpleCov.result" do
expect(SimpleCov.result).to be_nil
describe ".resultset" do
it "caches" do
set = SimpleCov::ResultMerger.resultset
new_set = SimpleCov::ResultMerger.resultset
expect(new_set).to be(set)
end

it "synchronizes reads" do
expect(SimpleCov::ResultMerger).to receive(:synchronize_resultset)
SimpleCov::ResultMerger.resultset
end
end

describe ".synchronize_resultset" do
it "is reentrant (i.e. doesn't block its own process)" do
# without @resultset_locked, this spec would fail and
# `.store_result` wouldn't work
expect do
Timeout.timeout(1) do
SimpleCov::ResultMerger.synchronize_resultset do
SimpleCov::ResultMerger.synchronize_resultset do
end
end
end
end.not_to raise_error
end

it "blocks other processes" do
file = Tempfile.new("foo")

other_process = open("|ruby -e " + Shellwords.escape(<<-CODE) + " 2>/dev/null")
require "simplecov"
SimpleCov.coverage_dir(#{SimpleCov.coverage_dir.inspect})
# ensure the parent process has enough time to get a
# lock before we do
sleep 0.5
$stdout.sync = true
puts "running" # see `sleep`s in parent process
SimpleCov::ResultMerger.synchronize_resultset do
File.open(#{file.path.inspect}, "a") { |f| f.write("process 2\n") }
end
CODE

SimpleCov::ResultMerger.synchronize_resultset do
# wait until the child process is going, and then wait some more
# so we can be sure it blocks on the lock we already have.
sleep 0.1 until other_process.gets == "running\n"
sleep 1

# despite the sleeps, this will be written first since we got
# the first lock
File.open(file.path, "a") { |f| f.write("process 1\n") }
end

# wait for it to finish
other_process.gets

expect(file.read).to eq("process 1\nprocess 2\n")
end
end
end
Expand Down
109 changes: 109 additions & 0 deletions spec/simplecov_spec.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,109 @@
require "helper"

if SimpleCov.usable?
describe SimpleCov do
describe ".result" do
before do
SimpleCov.clear_result
allow(Coverage).to receive(:result).once.and_return({})
end

context "with merging disabled" do
before do
allow(SimpleCov).to receive(:use_merging).once.and_return(false)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't we say SimpleCov.use_merging(false)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i suppose, if we set it back. this feels safer IMO, since we're not changing instance variables of the SimpleCov class, which may have side effects in later specs

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, fair point thanks! 👍

end

context "when not running" do
before do
allow(SimpleCov).to receive(:running).and_return(false)
end

it "returns nil" do
expect(SimpleCov.result).to be_nil
end
end

context "when running" do
before do
allow(SimpleCov).to receive(:running).and_return(true, false)
end

it "uses the result from Coverage" do
expect(Coverage).to receive(:result).once.and_return(__FILE__ => [0, 1])
expect(SimpleCov.result.filenames).to eq [__FILE__]
end

it "adds not-loaded-files" do
expect(SimpleCov).to receive(:add_not_loaded_files).once.and_return({})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as this is basically ourselves, I'd think rather testing the effect that add_not_loaded_files has would make sense instead of mocking it. What do you think?

SimpleCov.result
end

it "doesn't store the current coverage" do
expect(SimpleCov::ResultMerger).not_to receive(:store_result)
SimpleCov.result
end

it "doesn't merge the result" do
expect(SimpleCov::ResultMerger).not_to receive(:merged_result)
SimpleCov.result
end

it "caches its result" do
result = SimpleCov.result
expect(SimpleCov.result).to be(result)
end
end
end

context "with merging enabled" do
let(:the_merged_result) { double }

before do
allow(SimpleCov).to receive(:use_merging).once.and_return(true)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't we just do SimpleCov.use_merging(false) ?

allow(SimpleCov::ResultMerger).to receive(:store_result).once
allow(SimpleCov::ResultMerger).to receive(:merged_result).once.and_return(the_merged_result)
end

context "when not running" do
before do
allow(SimpleCov).to receive(:running).and_return(false)
end

it "merges the result" do
expect(SimpleCov.result).to be(the_merged_result)
end
end

context "when running" do
before do
allow(SimpleCov).to receive(:running).and_return(true, false)
end

it "uses the result from Coverage" do
expect(Coverage).to receive(:result).once.and_return({})
SimpleCov.result
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like we might wanna assert the return value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little weird here because .result just gives us the the_merged_result it got from SimpleCov::ResultMerger ... so I don't know how useful it is to check the return value here.

That said, I tweaked the similar spec on line 31 (non merging scenario) to show that the Coverage data does indeed get used

end

it "adds not-loaded-files" do
expect(SimpleCov).to receive(:add_not_loaded_files).once.and_return({})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as add_not_loaded_files is on the thing under test I thinkw e mgiht opt to test what it does not that it gets called, wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, well i'd say SimpleCov.result is the thing under test, not SimpleCov, so i kind of like this test ¯_(ツ)_/¯ ... maybe add_not_loaded_files could/should by handled by a separate object (though that's a bigger refactor) ? at any rate, it feels weird for a .result test to have to stub/fake a bunch of File/Dir stuff. though i do agree some sort of test around add_not_loaded_files' behavior would be nice

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not wanting to stub a bunch of file stuff is a perfectly fine reason for stubbing early 👍 I somehow thought the effects would be easier to test, but with this I much prefer the current stub. True, maybe a refactoring for another object but I'd have to dig deeper to see what scope would make sense there.

SimpleCov.result
end

it "stores the current coverage" do
expect(SimpleCov::ResultMerger).to receive(:store_result).once
SimpleCov.result
end

it "merges the result" do
expect(SimpleCov.result).to be(the_merged_result)
end

it "caches its result" do
result = SimpleCov.result
expect(SimpleCov.result).to be(result)
end
end
end
end
end
end