
Improve quality of coverage reports by parsing source files for functions #36

Merged: 8 commits into JuliaCI:master, Jan 6, 2015
Conversation

@timholy (Member) commented Jan 5, 2015

This alters, and perhaps improves, our ability to quantify coverage. It leverages @jakebolewski's JuliaParser.jl to extract the beginning/end line range of "simple" function expressions. It then amends the coverage information by inserting 0 instead of `nothing` for functions that were never compiled.
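
To make the amendment step concrete, here is a minimal sketch (assumed shape and names, not the PR's actual code): given per-line counts where `nothing` means "no data", every line that falls inside a parsed function definition gets promoted from `nothing` to 0:

function amend!(counts::Vector{Any}, function_ranges::Vector{UnitRange{Int}})
    for r in function_ranges, ln in r
        if counts[ln] === nothing
            counts[ln] = 0   # inside a function, but never compiled/run
        end
    end
    return counts
end

# e.g. amend!(Any[nothing, 1, nothing, nothing], [1:3]) marks lines 1 and 3
# as uncovered, while line 4 (not known to be inside a function) stays `nothing`.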

This is not perfect. Here are deficits that I'm aware of currently:

  • It misses functions defined inside a `for f in (:min, :max, ...) @eval begin ...` style block; see the sketch after this list. (This error contributes to overly optimistic coverage estimates.)
  • It falsely penalizes `end` statements and the initial `function f(...)` line of a multi-line function. (This error contributes to overly pessimistic estimates.)
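
A hedged illustration of that first bullet (the function names are invented for the example): the loop below defines fastsin and fastcos at macro-expansion time, so neither definition ever appears literally in the source that the parser walks.

for f in (:sin, :cos)
    @eval $(Symbol(:fast, f))(x) = $f(x)   # defines fastsin(x) and fastcos(x)
end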

There is also a genuine risk of the line count getting "out of sync", since the logic for extracting line numbers is not entirely trivial. I haven't seen it happen yet, but frankly I've only tested against a single source file.

I suspect both of these deficits may be fixable, so I'm not certain this is worth merging yet. On the other hand, I'm not sure I want to dump a huge amount of effort into this; if it's viewed as a step forward, perhaps we should go for it. But thanks to Jake it was surprisingly easy to get this far; maybe it wouldn't be much harder to fix one or both of these problems (though I don't have any good ideas yet).

@jakebolewski

You would also have to turn off inlining, but this might be a step forward. At least it would give a more honest coverage percentage.

@timholy (Member, Author) commented Jan 6, 2015

We can now just do that with `--inline=no`.
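
For anyone following along, a full coverage run would then look something like this (both flags are standard Julia options; the test-script path is illustrative):

julia --code-coverage=user --inline=no test/runtests.jl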

@IainNZ (Contributor) commented Jan 6, 2015

@timholy, regarding the second bullet point: does that apply only to functions that were not caught before, or to all functions?

@IainNZ (Contributor) commented Jan 6, 2015

I'd like to try this out on JuMP; it seems like a great step forward.

@timholy (Member, Author) commented Jan 6, 2015

Looking at it a bit more, unfortunately the line numbering is borked. I think this needs to be fixed in JuliaParser.

@timholy (Member, Author) commented Jan 6, 2015

See JuliaLang/JuliaParser.jl#13

@timholy (Member, Author) commented Jan 6, 2015

OK, I think I've fixed both of the problems above. I worked around the line-number problem by first finding the file position corresponding to the beginning of each source line, and then using that to keep track of where I am in the file. And it turns out that with proper line-number accounting, my concerns about `end`, etc., went away.
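
A minimal sketch of that bookkeeping trick, under my assumption (not the PR's actual code) that we record the byte offset at which each source line begins and then binary-search to recover the line for any file position:

function line_offsets(src::String)
    offsets = Int[1]                 # line 1 starts at byte 1
    for i in 1:sizeof(src)
        if codeunit(src, i) == UInt8('\n')
            push!(offsets, i + 1)    # the next line starts just past the newline
        end
    end
    return offsets
end

# line number containing byte position `pos` (1-based)
line_of(offsets::Vector{Int}, pos::Int) = searchsortedlast(offsets, pos)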

@timholy (Member, Author) commented Jan 6, 2015

One remaining problem: what do we do about whole files that lack tests? In that case, there is no *.cov file.

EDIT: I decided to include them in the coverage results; see the sketch below.
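
A sketch of how that fallback might look (the `.cov` naming and the fixed-width count column are my assumptions based on Julia's coverage output, not code from this PR):

function line_counts(srcfile::AbstractString)
    covfile = srcfile * ".cov"   # naming is illustrative
    nlines = countlines(srcfile)
    # No .cov file: the source was never loaded under --code-coverage, so report
    # "no data" for every line and let the parsing pass mark functions as uncovered.
    isfile(covfile) || return Union{Nothing,Int}[nothing for _ in 1:nlines]
    # Otherwise read the leading count column; "-" or blank parses to `nothing`.
    return Union{Nothing,Int}[tryparse(Int, strip(first(ln, 9))) for ln in eachline(covfile)]
end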

[Commit message] This also limits testing to *.jl files and disables the `rethrow` in case of error: if the user has a broken *.jl file sitting in that directory, JuliaParser will complain about it.
@timholy (Member, Author) commented Jan 6, 2015

AFAICT, this will never pass Travis; it seems that Pkg uses the old REQUIRE file to install dependencies, so JuliaParser does not get installed and this PR fails its tests. You can, of course, try it locally (it works for me).

@aviks commented Jan 6, 2015

+1000

This is a vastly better state of affairs than before. I'd stopped looking at coverage results, since adding a single test gave you a 90% coverage statistic.

@hayd (Contributor) commented Jan 6, 2015

@timholy, couldn't you (as a band-aid) `Pkg.add("JuliaParser")` in the .travis.yml?

This is going to be ace!!

The `for f in (:min, :max, ...) @eval begin ...` issue seems similar to coverage not testing different datatypes (is coverage just the wrong metric?)...

@timholy (Member, Author) commented Jan 6, 2015

@hayd, once this gets merged, that change wouldn't be necessary anymore, so I'm not sure it's worth it.

Yes, there absolutely are subtleties (or really, not-so-subtleties...) that this does not handle well. I don't currently have any great ideas about how to take the next step, though. It seems you want some kind of summary number.

@jakebolewski

Getting good coverage results, or any notion of "coverage percentage", is going to be difficult in any language that does runtime code generation. I think we just have to treat this as a flawed but useful metric. Even Stefan's suggestion of instrumenting basic blocks will only work for code that has actually been generated, and that is complicated by things that generate code at runtime, such as staged functions.

@timholy, I don't know which packages you have used, but maybe we should agree on a non-trivial one to test on. Glancing at the code, doesn't this have issues with functions that are not at the toplevel of the file?

For example:

begin
    func() = ...
end

or

module Foo
    module Bar
        func() = .....
    end
end

I'll look into making JuliaParser easier to use for this application. There have been some regressions in compatibility with Base during the 0.4 dev period.

@timholy (Member, Author) commented Jan 6, 2015

My earlier worries about non-toplevel functions were erased by later commits; basically, I think this is perfect 😄. Aside from generated code, the one exception I know of is this construction:

@doc """
This is the best function I ever wrote.
""" ->
f(x) = 2x

The reason is that, because of the `->`, the line number for that expression does not appear inside the function body, in contrast with a function defined simply as

f(x) = 2x

@timholy (Member, Author) commented Jan 6, 2015

Oh, regarding your question about testing on a package: my email to julia-dev tested two "mystery" packages, both with a nontrivial amount of code (>2k lines each; one was over 4k).

@timholy changed the title from "RFC: improve quality of coverage reports by parsing source files for functions" to "Improve quality of coverage reports by parsing source files for functions" on Jan 6, 2015.
@timholy (Member, Author) commented Jan 6, 2015

BTW, I've removed the "RFC"; I'm not planning on pushing any more patches. I do wonder about hooking this into PackageEval and seeing if we can come up with coverage fractions for all registered packages. That could happen either before or after merging this.

@IainNZ (Contributor) commented Jan 6, 2015

LGTM

@IainNZ merged commit 4057b79 into JuliaCI:master on Jan 6, 2015.
@IainNZ (Contributor) commented Jan 6, 2015

I think getting coverage numbers in PackageEval is definitely possible.

@IainNZ (Contributor) commented Jan 6, 2015

@timholy, can you tag a new version of Coverage when you're ready? I have limited internet access.

@timholy (Member, Author) commented Jan 7, 2015

Hmm, I just did a push on Images, and indeed it modified my coverage stats: https://coveralls.io/builds/1701691 (alternative link: JuliaImages/Images.jl#243 (comment)). I expected it to go down, but not to 0%. Seems like something is wrong.

The part I can't understand is that the first element in that array of Dicts looks entirely reasonable (screenshot of its `coverage` entry omitted). Any insights into why that file is nevertheless scored as having 0% coverage on Coveralls?
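
For reference, each element of that array follows the standard Coveralls source-file shape; the values below are illustrative, not taken from the actual payload:

Dict("name"     => "src/Images.jl",               # path relative to the repo root
     "source"   => "module Images\n# ...",        # the file's full contents
     "coverage" => Any[nothing, 1, 0, nothing])   # per-line hits; `nothing` = not executable code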

@timholy (Member, Author) commented Jan 7, 2015

Oh, actually, Images has had broken coverage data for a long time: the last good build I found dates from August 21, although there must have been a more recent one because the badge shows a different number.

https://coveralls.io/r/timholy/Images.jl?page=14

Dunno what's going on; anyone have a clue?

@timholy (Member, Author) commented Jan 7, 2015

OK, I'm getting reports for HDF5, so clearly it's something specific to Images. It would be interesting to debug, but I'm not sure how to go about it.
