Allow Overall Statistics Calculation without loading the files that are covered #573

Open
jgonera opened this issue Apr 14, 2017 · 5 comments


jgonera commented Apr 14, 2017

Let's say I have a .resultset.json file but no source code. This file should be enough to generate some basic stats, like total code coverage, total number of lines, etc.

It is not possible now because SimpleCov refuses to record any data if the source code is not present:
https://github.com/colszowka/simplecov/blob/9bb1aa3a6735b16d4026c4c34db0de6973d9a883/lib/simplecov/result.rb#L29

This is unnecessary and very problematic for distributed CI systems. Even when the source code is present, it is a problem for big code bases, because SimpleCov loads every source file to calculate the total coverage: https://github.com/colszowka/simplecov/blob/05d877d1956bed364dfa4018edbc50c213ac94fe/lib/simplecov/source_file.rb#L83-L89 (and other parts of this class).

Is there any technical reason for this that I'm missing?

Collaborator

PragTob commented Apr 15, 2017

Hi there,

The major technical reason is that the goal of SimpleCov is to show which lines are covered and which aren't. Having some overall statistics is fine, but (imo) they don't help you much if you can't look at which lines are covered and which aren't, because you can't go and say "OK, maybe we should test MyClass#critical_method, as the tests seem to be missing it".

I believe this is the main reason this isn't happening. If you want to open a PR for this, feel welcome, but personally I doubt we'll tackle this soon :)

Collaborator

colszowka commented Apr 15, 2017 via email

Collaborator

bf4 commented Apr 15, 2017

related: #558 (comment)

I found that the files list wasn't set correctly when I tried to load and merge results from another machine.
Probably good for a future PR...
but it doesn't work with the HTML formatter, because that tries to read each file to overlay coverage in the report (source_file.rb):

def src
  # We intentionally read source code lazily to
  # suppress reading unused source code.
  @src ||= File.open(filename, "rb", &:readlines)
end

require 'simplecov'
require 'json'

SimpleCov.coverage_dir 'coveragetest'
SimpleCov.formatters = [SimpleCov::Formatter::SimpleFormatter]
# Result set files copied over from the other machine(s)
sourcefiles = Dir.glob('./resultset*')

# Work around `if File.file?(filename)` when running on another filesystem:
# build the file list straight from the raw result.
add_files = lambda do |sr|
  files = sr.original_result
            .map { |filename, coverage| SimpleCov::SourceFile.new(filename, coverage) }
            .compact
            .sort_by(&:filename)
  sr.instance_variable_set(:@files, SimpleCov::FileList.new(files))
  sr
end

results = sourcefiles
          .map { |file| SimpleCov::Result.from_hash(JSON.parse(File.read(file))) }
          .each { |result| add_files.call(result) }
result = SimpleCov::ResultMerger.merge_results(*results)
add_files.call(result)
result.format!

Author

jgonera commented Apr 17, 2017

@bf4 What I do now:

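# Machine-specific checkout path that prefixes every file name in the result sets.
repo_dir = "/some/machine/specific/path"
# Escape slashes so the path can sit inside a sed s/.../.../ expression.
repo_dir_escaped = repo_dir.gsub("/", "\\/")
# coverage_path comes from the surrounding CI script (not shown here).
resultset_glob = File.join(coverage_path, "**", ".resultset.json")
# Strip the prefix in place from every matching result set file.
system("sed -i 's/#{repo_dir_escaped}\\///g' #{resultset_glob}")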

This makes the paths relative. Then when I merge results on one of the machines I first Dir.chdir to where the source code is checked out, do SimpleCov.filters.clear (because I don't want to filter relative paths by SimpleCov.root) and then run the merging as described in #219 (comment).

This feels quite shaky, but that's another problem which seems to highlight that we might want SimpleCov to have a broader public API instead of relying on its internals (codeclimate/ruby-test-reporter#181 also seems to highlight this).
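For reference, the path rewriting described above could also be done in plain Ruby instead of sed. This is only a rough sketch: `relativize_paths` is a hypothetical helper, not part of SimpleCov, and it assumes the result set layout `{ command_name => { "coverage" => { absolute_path => line_data } } }`. Unlike the sed call, it only strips the prefix from the file name keys rather than every occurrence in the file.

```ruby
require 'json'

# Hypothetical helper (not SimpleCov API): returns a copy of parsed
# .resultset.json data with `repo_dir` stripped from the front of every
# file path. Assumed layout:
#   { command_name => { "coverage" => { absolute_path => line_data }, ... } }
def relativize_paths(resultset, repo_dir)
  prefix = repo_dir.end_with?('/') ? repo_dir : "#{repo_dir}/"
  resultset.transform_values do |run|
    coverage = run.fetch('coverage').transform_keys do |path|
      path.delete_prefix(prefix)
    end
    # Keep every other key of the run (e.g. the timestamp) untouched.
    run.merge('coverage' => coverage)
  end
end
```

The result could then be written back with `File.write(path, JSON.generate(relativize_paths(JSON.parse(File.read(path)), repo_dir)))`.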

> That being said, since the source files are read lazily (the lines you linked to in source file), as long as you do not actually access the source, you could generate results without hitting the file system.

@colszowka They kind of are, but how does that matter if the loading always needs to happen? As I said, I'd like to know total coverage for a given SHA tested in my CI. Looking at the source code, this seems to be the case:

Instead of this happening, we could just get those line counts (covered_lines and missed_lines) from .resultset.json without loading anything.
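To illustrate the idea, here is a rough sketch of computing those totals from the parsed JSON alone. `raw_totals` and `merge_line_hits` are hypothetical names, not SimpleCov API. The sketch assumes each run in the result set has a "coverage" hash mapping a path to an array of hit counts (null meaning the line is not relevant), or, as in newer SimpleCov versions, to a hash with that array under "lines". The merge of several runs is a simplified element-wise sum and is not claimed to match SimpleCov's own merging, and nothing here reads source files.

```ruby
require 'json'

# Hypothetical helper: element-wise sum of two hit-count arrays.
# A line stays nil (not relevant) only if it is nil in both arrays.
def merge_line_hits(a, b)
  Array.new([a.length, b.length].max) do |i|
    a[i].nil? && b[i].nil? ? nil : a[i].to_i + b[i].to_i
  end
end

# Hypothetical helper: covered/missed line totals from parsed
# .resultset.json data, without touching any source file.
def raw_totals(resultset)
  merged = {}
  resultset.each_value do |run|
    run.fetch('coverage').each do |path, line_hits|
      # Newer result sets nest the array under "lines" (assumption, see above).
      line_hits = line_hits.fetch('lines') if line_hits.is_a?(Hash)
      merged[path] = merged.key?(path) ? merge_line_hits(merged[path], line_hits) : line_hits
    end
  end

  covered = 0
  missed = 0
  merged.each_value do |line_hits|
    line_hits.each do |hits|
      next unless hits.is_a?(Integer) # skip nil and anything unexpected
      if hits.zero?
        missed += 1
      else
        covered += 1
      end
    end
  end

  relevant = covered + missed
  percent = relevant.zero? ? 100.0 : covered * 100.0 / relevant
  { covered_lines: covered, missed_lines: missed, percent: percent }
end
```

Usage would be something like `raw_totals(JSON.parse(File.read("coverage/.resultset.json")))`, where the path is just an example location.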

I know that it is useful to show which lines are covered and which are not, but that could be done only for files that I request (e.g. files that I know changed in a given commit/PR), without loading every single file just to get a metric of overall code coverage.

PragTob added the Feature label on Dec 3, 2019
PragTob changed the title from "SimpleCov requires original source code for all stats" to "Allow Overall Statistics Calculation without loading the files that are covered" on Dec 3, 2019
Collaborator

PragTob commented Dec 9, 2019

It just occurred to me that the reason we are reading all files is that we need to look for the nocov comments to accurately report coverage.

If we were to implement this, it'd probably need to be some separate method or whatnot that explicitly doesn't support the nocov comments, which adds quite some complexity. On the other hand, it'd just deal with raw coverage results, which might be easier.
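To make the trade-off concrete, here is a rough sketch of why nocov handling needs the source text. `apply_nocov` and `NOCOV_MARKER` are hypothetical names, and the marker regex is a simplified assumption for illustration, not necessarily the exact pattern SimpleCov uses. The markers only exist as comments in the source, so a raw-results method has no way to find them.

```ruby
# Simplified marker pattern, assumed for illustration only.
NOCOV_MARKER = /^\s*#\s*:nocov:/.freeze

# Hypothetical helper: returns a copy of line_hits in which every line inside
# a :nocov: block (including the marker lines) is set to nil, so it no longer
# counts as a relevant line. Requires the source lines to locate the markers.
def apply_nocov(source_lines, line_hits)
  skipping = false
  line_hits.each_with_index.map do |hits, index|
    marker = source_lines[index].to_s.match?(NOCOV_MARKER)
    skipping = !skipping if marker # each marker toggles the block on or off
    skipping || marker ? nil : hits
  end
end
```

A line with zero hits inside such a block counts as missed in the raw results but is ignored once the markers are applied, so the two approaches can report different totals for the same run.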

4 participants