Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coverage misses lines #28192

Closed
iamed2 opened this issue Jul 19, 2018 · 40 comments
Closed

Coverage misses lines #28192

iamed2 opened this issue Jul 19, 2018 · 40 comments
Labels
bug Indicates an unexpected problem or unintended behavior

Comments

@iamed2
Copy link
Contributor

iamed2 commented Jul 19, 2018

For this file: https://github.com/invenia/Memento.jl/blob/master/src/stdlib.jl

The coverage report is:

        - if VERSION > v"0.7.0-DEV.2980"
        -     import Base.CoreLogging:
        -         AbstractLogger,
        -         handle_message,
        -         min_enabled_level,
        -         shouldlog,
        -         global_logger,
        -         Debug
        - 
        -     const LEVEL_MAP = Dict{AbstractString}
        -     struct CoreLogger <: AbstractLogger
        -     end
        - 
        -     min_enabled_level(logger::CoreLogger) = Debug
        4     shouldlog(logger::CoreLogger, arags...) = true
        - 
        -     function handle_message(::CoreLogger, cl_level, msg, mod, group, id, filepath, line; kwargs...)
        2         logger = getlogger(mod)
        -         level = lowercase(string(cl_level))
        2         log(logger, logger.record(logger.name, level, getlevels(logger)[level], msg))
        -     end
        - 
        -     function substitute!()
        1         global_logger(CoreLogger())
        -         notice(getlogger(@__MODULE__), "Substituting global logging with Memento")
        -     end
        - else
        -     function substitute!()
        -         warn(
        -             getlogger(@__MODULE__),
        -             "Global logging substitution is not support for julia $VERSION"
        -         )
        -     end
        - end
        - 

Notice that level = lowercase(string(cl_level)) is between two covered lines but somehow isn't reported as covered.

@vtjnash
Copy link
Member

vtjnash commented Jul 19, 2018

Now is perhaps a good time to update LineInfoNode to track line-number ranges in the parser (but might be hard still for the optimizer to handle updating this information). And see also https://github.com/JuliaLang/julia/pull/11802/files

Perhaps we should inject this information before optimization (like we are doing at the llvm level), as this'll problem will only get worse as our optimizer gets better. We could more-easily do it then at the basic-block level. This would be more likely to require recompiling the system image to switch the flag though.

@maxbennedich
Copy link
Contributor

We are seeing the same thing starting in Julia 0.7. Every Julia 0.7 and 1.0 project with test coverage I've seen suffer from the same thing. I've tried with --inline=no but it's still incorrect for us.

@IainNZ IainNZ added the bug Indicates an unexpected problem or unintended behavior label Sep 17, 2018
@navidcy
Copy link
Contributor

navidcy commented Oct 16, 2018

Any updates on this issue?
I haven't been able to use Coveralls or Codecov.io since v0.7. This was actually very useful in developing tests for my packages. Without that I have no way to find which lines of codes are actually covered. Is there a way to do that and I'm missing it out?

@sbromberger
Copy link
Contributor

I don't know whether this is related, but LightGraphs just dropped to from 99.8% to 66% codecov (and the reported coverage gaps don't make sense).

@matteoacrossi
Copy link

I have the same problem in one of my packages. It simply misses lines, as in the OP.
Any news on this?

@fingolfin
Copy link
Member

fingolfin commented Oct 31, 2018

To clarify, this issue is certainly amplified by recent changes to https://github.com/JuliaCI/Coverage.jl These are my "fault", although I "only" enabled code that was used on Julia 0.6 to also work correctly on Julia 0.7 and 1.0, which made the issue much more visible. Some discussions on mitigating the problem:

@sbromberger
Copy link
Contributor

sbromberger commented Dec 2, 2018

Hey folks, this is really getting frustrating. LightGraphs dropped from 100% to 68% code coverage several months ago, and I can't use the --inline=no trick because 1) it results in an inference error (that doesn't occur with inlining), and 2) travis times out.

Right now all our PRs are failing because we require codecov to be > 97%. I'd rather not change that, because it's an important metric. Also, it makes the project look bad to show a red "68%" coverage badge, especially when we've spent all this time getting coverage up to 100%.

What workarounds are available right now?

@DilumAluthge
Copy link
Member

DilumAluthge commented Dec 3, 2018 via email

@sbromberger
Copy link
Contributor

@DilumAluthge thanks. That might address one concern, but all PRs will still fail because we have a policy of only allowing PRs that have an aggregate codecov of 99+% and that do not drop the overall codecov below 97%.

@sbromberger
Copy link
Contributor

As an update, setting --inline=no and ignoring nightly failures reduces reported coverage:
sbromberger/LightGraphs.jl#1073

I don't know how to get proper reporting at this point.

@mauro3
Copy link
Contributor

mauro3 commented Dec 3, 2018

Maybe you needs something like https://github.com/auchenberg/volkswagen for the interim ;-)

@fingolfin
Copy link
Member

So one option would be to revert some of the Coverage.jl changes -- then dead / unused functions would not be reported as code, and you'd get your coverage up again (but at the cost of loosing the ability to find unused functions via coverage reports -- which to me is a major loss, but of course I realized this may be quite different for other people).

Another potential solution is described on JuliaCI/Coverage.jl#188 -- that PR is incomplete, though, but I do try to sketch how one could complete it. The core idea there is that a function which has zero lines with coverage likely is unused, so we just mark all its lines as unexecuted code. So far, Coverage.jl does that for every line of code in every function; but now, we would not do so anymore: Rather, any function which has at least one line of code that was executed at least once, clearly is not unused, so we don't mark any code lines it as unexecuted (even though it may contain lots of those, due to the inlining issue discussed here). The missing part in the PR is that one has to do this recursively, to treat functions which are nested in other functions or in modules correctly. That shouldn't be too hard, I reckon (famous last words), but I currently have other far more pressing matters to take care of; so if somebody wants to take a stab at completing that PR in the meantime, please don't wait for me (I'll try to be available for discussion or reviews, if desired, though).

@sbromberger
Copy link
Contributor

then dead / unused functions would not be reported as code, and you'd get your coverage up again

This isn't the problem, I don't think. We have 100% coverage of all functions in the tests. The problem is that something changed and the coverage is not being reported. Here's an example.

How is it that lines 26 and 28 both have coverage (3 separate times), but line 27 - which is not in the middle of any branch - has no coverage? I can tell you that this function is being tested here.

This has happened throughout the codebase and it all started a few months ago.

The core idea there is that a function with has zero lines with coverage likely is unused

Again, we don't have any of these to my knowledge. All functions are tested. We had a huge effort about a year and a half ago to ensure that all the code receives test coverage.

@sbromberger
Copy link
Contributor

sbromberger commented Dec 3, 2018

Someone mentioned that kwargs makes it difficult to troubleshoot this, so here's an example that may be more straightforward.

Coverage, and Test.

@maxbennedich
Copy link
Contributor

We never found a workaround. --inline=no didn't really make any difference. We gave up in the end and stopped tracking test coverage :(

@fingolfin
Copy link
Member

@sbromberger Let me explain a bit longer as to why I think that what I wrote is 100% relevant to the problem. I need to elaborate a bit for this, so please bear with me...

When Julia is aksed to generate coverage data, it outputs coverage files for every source file (see the description of this issue for an example). In there, it records (slightly simplified) for every line whether it contains code or not; if it contains code, it also records how often that line was executed, as a non-negative integer. The number 0 indicates a line is code, but was not executed (= dead code).

Now, what happens in practice is that Julia reports a lot of lines which you and me would clearly consider as "this is code" instead as "this is not code at all". The result is that these lines do not contribute for coverage. Now, this essentially happens in two cases:

  1. Dead functions: if a function is never called (and hence Julia never JITs it / never generates LLVM bitcode for it), then it is recorded as "not code"
  2. If code is inlined, it often (always?) is marked as "not code".

These two cases have ultimately (I think) the same origin (but really, it doesn't matter, so let's not squabble about it), but have very different effects:

  • case 1 is not affected by inlining being on or off at all
  • case 2 is affected by inlining, which is why using --inline=no usually is a good workaround (@maxbennedich it would be helpful to see an example where it *did not help; care to elaborate?)
  • case 1 is highly undesirable, because it makes it hard to find dead functions. Indeed, you could have 100 lines of active code and 1 million lines of dead code, and still get 100% coverage reported
  • case 2 is often a bit confusing, but usually far less problematic when it comes to coverage reporting; you usually can still tell if some segment of code is being executed or not.

Now, what happened until recently is that Coverage.jl for Julia 0.7 and 1.0 simply passed on the reported coverage data to Codecov. This resulted in reports like this one where you see that line 27 clearly contains code, but the report says it is not code. While in this newer report, line 27 is marked as "code that was not executed".

But for Julia 0.6, Coverage.jl had (has) a function amend_coverage_from_src! which, as its name suggests, tries to "improve" the coverage data, in an attempt to fix case 1. What recently changed is that I fixed this function to also work in Julia 0.7 / 1.0. And this in turn, combined with the inlining problems reported in this issue here, causes the undesirable Coverage drops you and others are experiencing. Relevant is specifically my merged PR JuliaCI/Coverage.jl#179, which fixed various compatibility issues with Julia 0.7 resp. 1.0.

What does amend_coverage_from_src! do? It tries to amend the coverage report, by looking for lines which are code, but which are not marked as code. Unfortunately, this function is too aggressive, as it marks all such lines as dead code -- even those which are due to case 2 above (or, depending on whom one wants to blame, Julia is not reporting inlined code lines correctly, which e.g. gcov for C/C++ code manags -- that's what this issue here on Julia repository is about).

Hence my proposed fix: teach amend_coverage_from_src! to be less aggressiv and distinguish between case 1 and case 2 above via a heuristic: if a function contains at least one line that is marked as code, then don't "amend" its coverage data. I am pretty sure this would fix all your coverage problems -- though of course ultimately one has to try it to be sure. This is what
JuliaCI/Coverage.jl#188 starts, but as I wrote above, it is not quite complete yet, as now it again amends too little code; but as described, this could be greatly improved with some effort.

@sbromberger
Copy link
Contributor

sbromberger commented Dec 3, 2018

Thanks, @fingolfin - that was really helpful.

If it helps, --inline=no reduced our code coverage as well in LightGraphs. (See sbromberger/LightGraphs.jl#1073 for the report). Maybe that will give you additional insight?

Also, I'm confused:

But for Julia 0.6, Coverage.jl had (has) a function amend_coverage_from_src! which, as its name suggests, tries to "improve" the coverage data, in an attempt to fix case 1. What recently changed is that I fixed this function to also work in Julia 0.7 / 1.0.

But we had ~100% code coverage in 0.6 as well, so whatever happened wasn't a 1:1 reimplementation for 0.7/1.0.

@fingolfin
Copy link
Member

In Julia 0.6, this inlining issue did not exist (AFAIU)

@maxbennedich
Copy link
Contributor

case 2 is affected by inlining, which is why using --inline=no usually is a good workaround (@maxbennedich it would be helpful to see an example where it *did not help; care to elaborate?)

Hmm, in our experiment it didn't make any difference whatsoever. Did we miss something? Here's the commit.

And here's the resulting coverage report.

"COVERAGE REMAINED THE SAME AT 87.328%"

@fredrikekre
Copy link
Member

Hmm, in our experiment it didn't make any difference whatsoever. Did we miss something?

--inline is not transferred to the process that Pkg.test starts in Julia 1.0.2.

@vtjnash
Copy link
Member

vtjnash commented Dec 3, 2018

I think #30251 should much improve case 2 for --inline=yes|default.

@maleadt
Copy link
Member

maleadt commented Dec 5, 2018

Tried out #30251 with LLVM.jl: coverage on 1.0.2 is 70% down from 97% before the coveragepocalyps, on 6594acc it's back to 86% (note sure why clicking through to the files is broken, that's recent).

@fingolfin
Copy link
Member

@maleadt Thanks, that's useful. I picked a file "at random" to get an idea how often case 1 (code is inlined and incorrectly marked as unexecuted code) vs case 2 (actual dead code) is. Specifically, I chose src/core/instructions.jl, which dropped from 93% coverage to still just 75% coverage with #30251.

Looking at it, we see that for example lines 43+44, which contain

predicate_int(inst::Instruction) = API.LLVMGetICmpPredicate(ref(inst))
predicate_real(inst::Instruction) = API.LLVMGetFCmpPredicate(ref(inst))

are marked as "not code" in the former, but are marked as "dead code" in the latter. I can't tell which one is correct, though, but I assume you can tell us?

If it is actual dead code, then this is case 2 in my description above, and so in a sense an improvement. If it is code that is not dead, then it means these functions were simply completely inlined. Unfortunately, this then is a situation which the heuristic I described as potential workaround above (and also see JuliaCI/Coverage.jl#188) won't be able to deal with. And then I don't see anything that Coverage.jl could do to disambiguate the two cases. So I guess if this is happening, then the only fix for it is to stop using amend_coverage_from_src! in Coverage.jl, and live with being unable to detect dead functions in coverage reports :/. The only way to change I can think of right now then would be for Julia to generate "better" coverage reports which somehow allow disambiguating dead functions from completely inlined one.

@fingolfin
Copy link
Member

According to https://github.com/maleadt/LLVM.jl/search?q=predicate_int&unscoped_q=predicate_int there is indeed no place calling predicate_int, so I still have some hopes, but I'd rather have @maleadt confirm this.

@maleadt
Copy link
Member

maleadt commented Dec 5, 2018

Thanks for the analysis, that is dead (well, untested) code indeed so definitely an improvement.
Looking through some other files, coverage statistics seem very reasonable now, and much more accurate wrt. dead code.

@vtjnash
Copy link
Member

vtjnash commented Dec 5, 2018

I found that amend_coverage missed a function (https://codecov.io/gh/maleadt/LLVM.jl/src/752aae1a6f86d17877633c6cea70a529a1ad05a4/src/passmanager.jl#L5), but that everything reported seemed correct.

@tkoolen
Copy link
Contributor

tkoolen commented Dec 5, 2018

Awesome, thanks @vtjnash!

@fingolfin
Copy link
Member

Thanks @vtjnash ! Hmm, I wonder a bit about function define_transforms in https://codecov.io/gh/maleadt/LLVM.jl/src/752aae1a6f86d17877633c6cea70a529a1ad05a4/src/transform.jl -- it has line 82 marked as executed code, the rest as code that was not executed. How can that be? Perhaps because it's inside an @eval block?

@maleadt
Copy link
Member

maleadt commented Dec 5, 2018

How can that be? Perhaps because it's inside an @eval block?

That global code is executed during precompilation, in a different process, which doesn't get passed the coverage flag. The compiler also doesn't seem ready to handle this. But even then, these kind of codes are typically executed by the interpreter, which doesn't seem to record coverage at all.

@vtjnash
Copy link
Member

vtjnash commented Dec 22, 2018

@maleadt In #30381, I've configured julia_cmd() to pass more variables (they probably should have been environment variables, but I guess that's not possible now), and hacked around that codegen issue. Someday hopefully our compiler will be able to handle codegen more gracefully

@maleadt
Copy link
Member

maleadt commented Dec 22, 2018

Nice, thanks! The define_transforms function @fingolfin spotted above is now also properly covered.

@Wikunia
Copy link
Contributor

Wikunia commented Feb 24, 2019

Is there any solution now for julia 1 and 1.1?
Have the same problem with Juniper.jl.
This for example doesn't make sense:
https://codecov.io/gh/lanl-ansi/Juniper.jl/src/3c2634ed82d430871cef9a292926ee4fcc7cf21c/src/solver.jl#L9...26
Is inline=no implemented for julia v1? If yes how can I use it?
Currently running using Pkg; Pkg.clone(pwd()); Pkg.test("Juniper", coverage=true)

@bramtayl
Copy link
Contributor

bramtayl commented Mar 8, 2019

I haven't understood the extent of this problem until recently but this seems like Real Bad (TM)

@bramtayl
Copy link
Contributor

Can this get a milestone?

@KristofferC
Copy link
Member

What milestone and why?

@bramtayl
Copy link
Contributor

Any milestone (I'm guessing it would have to be 2.0 because if I'm understanding correctly, the solution is to update LineNumberNode to Vector{LineNumberNode}) because code coverage is a critical "sanity check" for testing and writing stable/functional code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

No branches or pull requests