-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for branch coverage in source-based coverage #79649
Comments
cc @richkadel you might want to include it in #79121 |
Actually, at last I checked, this is not in any stable version of LLVM yet. I think it's feasible to stabilize Rust coverage in rustc, based on the current feature of LLVM InstrProf, without depending on proposed future LLVM features. So, IMO, I don't think we should add this to the tracking issue, as a dependency. In fact, until this capability is stabilized in LLVM, and available in a version of LLVM that's supported by rustc, I think it makes sense to keep this as a stand-alone feature request / enhancement. Let me know if any of my assumptions are incorrect. |
Cc @tmandry |
I have no idea how adding features like that works, I thought it will be available right away in LLVM 12 around March. |
Neat. This would certainly improve the way we present coverage for these things, and might make it possible to be more intuitive about how we handle closing braces. Just to throw it out there, since I'm thinking about async/await right now: This does offer an intriguing possibility of measuring drops at await points. It's possible for buggy code to assume it will never get canceled at an await point, when in fact that's part of the contract. It would be nice if we could highlight this in coverage results. Though explicitly testing cancelation at every await point sounds daunting... It should be there in the next major version of LLVM, whenever that is. |
Looks like this feature was released in LLVM 12.0.0: 14.0.0 was released 2022 March. Similar to the approved suggestion in other issues, where it was decided that an LLVM 11 feature could be used once LLVM 13 was released, could the requirement for code coverage working now be LLVM 12? I see the original umbrella tracking issue for standardized Rust coverage options still has a few issues open, but it otherwise looks stable and settled. |
I should have posted this comment a while ago. (It was discussed when upgrading the coverage mapping format to version 6, in PR #91207). Before anyone puts much effort into implementing LLVM "branch coverage", I recommend writing a test program that demonstrates where the existing Rust coverage region report would be improved with branch coverage. It's possible that branch coverage adds no value. Rusts existing coverage regions include counters for both sides of a branch condition, even when implied. For example, an
So from this you can already tell that (I think Clang's coverage may not have reported the implied Adding branch coverage support to Rust may or may not be worth the effort, but if someone wants to implement it, I'm happy to provide some guidance or answer questions. |
I think the main benefit of branch coverage, and what I as a potential user of it would like to see, is that it will point directly to the source region corresponding to the branching condition. Implementation (MIR) wise, all the counters are there. https://rustc-dev-guide.rust-lang.org/llvm-coverage-instrumentation.html is very detailed about how this all works. And in theory, implementing branch coverage would consist of introducing new counters that just "reference" existing counters for the true/false branch. But to repeat, I think this is rather a tooling and DX thing. Having a "zero-width" marker after the closing brace that happens to have |
Here is a possible example, I created exper-code-coverage, where I was comparing cargo-tarpaulin and cargo-llvm-cov crates. I created
If I only include the first test,
But, if I only include the last test I get less than 100% for
|
I've now dug a little deeper and used Based on this I conclude that this example provides an indication that "branch coverage" should be an improvement. DetailsHere is the asm code for
And here it is for
I piped these to
I then created the
Wondering what happened to it, I looked at the generated code and there are two
|
Specifically I added reference to rust issue 79649, rust-lang/rust#79649 and a couple comments I added.
## [why] Coverage calculations using the past `grcov` method has become increasingly ... odd, with strange lapses not obviously related to code changes. More recently, Rust (and `grcov`) have been promoting a new method for code coverage calculation, called instrumented (or source-based). Discussion about instrumented coverage has now been stabilized and included in [_The rustc book_][^1]. Unfortunately, the current incarnation of instrumented coverage does not support branch coverage calculation. It's, at best, a work-in-progress, possibly a works-as-designed, with, currently, a lack of strong support for adding branch support. "Region" coverage calculation is instead being discussed as a substitute for branch calculations. Ultimately, given increasing flakiness with the current method, a more robust and accurate method of line coverage is worth the loss of information arising from lack of branch coverage, which is of lower importance for most development. ### refs [^1]: https://doc.rust-lang.org/rustc/instrument-coverage.html - [Rust ~ Instrumentation-based coverage](https://doc.rust-lang.org/rustc/instrument-coverage.html) @@ <https://archive.is/ERWrk> - [HowTo collect Rust source-based coverage](https://marco-c.github.io/2020/11/24/rust-source-based-code-coverage.html) @@ <https://archive.is/X9R14> - [`grcov` issue ~ missing branch coverage](mozilla/grcov#520) - [Rust issue ~ add support for branch coverage](rust-lang/rust#79649)
## [why] Coverage calculations using the past `grcov` method has become increasingly ... odd, with strange lapses not obviously related to code changes. More recently, Rust (and `grcov`) have been promoting a new method for code coverage calculation, called instrumented (or source-based). Discussion about instrumented coverage has now been stabilized and included in [_The rustc book_][^1]. Unfortunately, the current incarnation of instrumented coverage does not support branch coverage calculation. It's, at best, a work-in-progress, possibly a works-as-designed, with, currently, a lack of strong support for adding branch support. "Region" coverage calculation is instead being discussed as a substitute for branch calculations. Ultimately, given increasing flakiness with the current method, a more robust and accurate method of line coverage is worth the loss of information arising from lack of branch coverage, which is of lower importance for most development. ### refs [^1]: https://doc.rust-lang.org/rustc/instrument-coverage.html - [Rust ~ Instrumentation-based coverage](https://doc.rust-lang.org/rustc/instrument-coverage.html) @@ <https://archive.is/ERWrk> - [HowTo collect Rust source-based coverage](https://marco-c.github.io/2020/11/24/rust-source-based-code-coverage.html) @@ <https://archive.is/X9R14> - [`grcov` issue ~ missing branch coverage](mozilla/grcov#520) - [Rust issue ~ add support for branch coverage](rust-lang/rust#79649)
## [why] Coverage calculations using the past `grcov` method has become increasingly ... odd, with strange lapses not obviously related to code changes. More recently, Rust (and `grcov`) have been promoting a new method for code coverage calculation, called instrumented (or source-based). Discussion about instrumented coverage has now been stabilized and included in [_The rustc book_][^1]. Unfortunately, the current incarnation of instrumented coverage does not support branch coverage calculation. It's, at best, a work-in-progress, possibly a works-as-designed, with, currently, a lack of strong support for adding branch support. "Region" coverage calculation is instead being discussed as a substitute for branch calculations. Ultimately, given increasing flakiness with the current method, a more robust and accurate method of line coverage is worth the loss of information arising from lack of branch coverage, which is of lower importance for most development. ### refs [^1]: https://doc.rust-lang.org/rustc/instrument-coverage.html - [Rust ~ Instrumentation-based coverage](https://doc.rust-lang.org/rustc/instrument-coverage.html) @@ <https://archive.is/ERWrk> - [HowTo collect Rust source-based coverage](https://marco-c.github.io/2020/11/24/rust-source-based-code-coverage.html) @@ <https://archive.is/X9R14> - [`grcov` issue ~ missing branch coverage](mozilla/grcov#520) - [Rust issue ~ add support for branch coverage](rust-lang/rust#79649)
A few months ago I looked into working on this, but I found it a bit tricky to fit branch coverage handling into the existing code, which is why I focused my attention on refactoring the existing coverage code instead. When my current work on rust-lang/compiler-team#645 eventually gets approved and merged, hopefully that will make it a bit easier to incorporate things like branch regions. |
One of the questions that will need to be answered when implementing this is whether branch coverage regions should be identified using heuristics during MIR analysis (which is how existing coverage regions are identified), or identified during HIR-to-MIR lowering (with access to precise information about original syntax forms like Identifying branches during HIR-to-MIR lowering should give more accurate results, and should be more straightforward and maintainable in the long run. But it would also require additional up-front effort to figure out how to add partial coverage analysis during lowering, transmit its conclusions through to the MIR analysis, and have that analysis take them into account. |
As #116046 gets closer to being a reality, I've been thinking more about possible implementation strategies for branch coverage. From the time I've spent looking at the existing coverage code, I strongly suspect that it's not possible to reach an acceptable implementation of branch coverage through MIR analysis alone. Trying to reverse-engineer if/then/else arms from MIR So the approach I'm currently thinking about (but haven't yet tested) is as follows:
|
I agree that introducing branch markers during (T)HIR lowering gives us the most control over what exact expressions are being considered, and their precise source spans. |
Hello @Zalathar, I saw that you recently pushed a lot of refactorings, to prepare the integration of branch coverage in the compiler I suppose? Do you have a time frame in mind? Do you already have patches regarding branch coverage? We would obviously be interested in collaborating |
@Jugst3r I upstreamed MC/DC support (and branch coverage) into Clang/LLVM and would love to be in the loop on this (and it's already being extended and improved). I'm not an expert on Rust, but I definitely want to follow what y'all are doing. That's great to hear there is interest in MC/DC. Please let me know if you have any questions. |
Sure @evodius96, we will definitely let you know of how our experiments go. I have tracked the LLVM work and I am aware of some of the improvements (notably the one lifting the 6 condition limits for MC/DC), but only discovered yesterday about the nice work @Zalathar did. To give you more context, I mainly have expertise on coverage (as I work on a tool that does coverage for Ada and C/C++ code that supports up to the MC/DC level and that is called gnatcoverage), but very little on the Rust compiler. The MC/DC implementation in GNATcoverage is robust, and has been used for many years by the customers. I can give input regarding the implementation of MC/DC in GNATcoverage if you need any by the way Imo, we also have better support when it comes to coverage reporting and source code obligations inside macro expansion in comparison to clang, but it would probably be more appropriate to discuss elsewhere. |
@Jugst3r @evodius96 We'd be happy to collaborate and share more details and help in accelerating the effort for bringing MC/DC to Rust. |
Hi @Ax9DTW. I would be happy to have a look at it. Do you have a draft PR or a repository for the implementation? IIUC, the primary difference between MC/DC coverage and branch coverage is the ability to identify decision as a whole. It is interesting as branch coverage needs to identify decisions, to make branch regions, but it does not need to preserve the whole decision valuation, which you need for MC/DC. So I suppose that requires some more metadata information to be added, similar to the branch markers that @Zalathar implemented? |
Branch coverage instrumentation has landed in #122322, and should be available in the next nightly. It can be enabled by including both You might also need to run The current implementation supports conditions in |
@ZhuUx Thanks for sharing! For your information, I work with @RenjiSann at AdaCore so I was aware of your work. For your information, GNATcoverage that is used in certification contexts applies MC/DC to all complex boolean logical expressions in additional to boolean expressions that are controlling of a control flow structure. By complex, I mean logical boolean expressions with a binary and / or operator. |
Found something confusing about the behavior of branch coverage to plain logical expression. Here is the code #![feature(coverage_attribute)]
fn bar(a: bool, b: bool) {
let x = a || b;
println!("evaluated {}", x);
}
#[coverage(off)]
fn main() {
bar(true, false);
bar(false, false);
} The current implementation reports coverage of |
Yes, this is the expected behaviour for branch coverage. In this example, But |
This might lead to something a bit weird. fn direct(a: bool, b: bool) {
if a || b {
println!("pass");
}
}
fn indirect(a: bool, b: bool) {
let v = a || b;
if v {
println!("pass");
}
}
fn main() {
direct(true, false);
direct(false, false);
indirect(true, false);
indirect(false, false);
} Current implementation reports 100% branch coverage for Clang adopts the way where both a and b are taken as branches in |
It seems to me that |
I think that both should be considered as branches, or at least considered as conditions in MC/DC. IIUC MC/DC is implemented on top of branch coverage, so if we consider them as conditions in MC/DC terminology but not as branches, it will introduce a divergence between the implementation of branch coverage vs MC/DC coverage which is not desirable in my opinion |
Regarding the behavior in the Furthermore, MCDC coverage, which is based on branch coverage, is defined by certification standards, and their stance on this subject is that Finally, I add that when replicating this with C++ and Clang, |
Remember that branch coverage isn't just a building-block for MC/DC, and it isn't only for people writing certified software. I agree that tracking whether But I'm not convinced that it should simply occur by default when branch coverage is enabled. |
I'm surprised to hear that clang treats |
Effectively, this is true for clang -- the goal was to provide a consistent metric for all conditions in an expression in the source code, though technically you're correct that condition (Actually for clang, a synthetic control-flow branch is inserted for the last condition on all boolean expressions in order to ensure a final count, even though this is obviously not necessary, given that, for example, in an |
I don't see contradiction in the expression
Based on this, I take line "in boolean expression second argument won't be calculated if first is true" with a grain of salt as a mind shortcut. Thus, I don't expect counting to work the same way in every single situation, as situations are actually different even visually they're the same. PS: braces added for clarity. |
The instrument behavior for branches containing The question is: how to determine false branches in pattern
The first opinion might be intuitive. However, one can argue that there is a key difference between or patterns and boolean or expressions: when we write
For now I have implemented the first proposal at #124278 but tend to the second in mcdc. And normal branch coverage may need to consider this too so I present it here to collect some advice or other views. |
For branch coverage (or at least the basic level of branch coverage), my current thinking is that Of course, that wouldn't be appropriate for MC/DC coverage (or a more detailed level of branch coverage), where users would probably need more detail in order to satisfy their testing/documentation requirements. I vaguely remember hearing that someone had been doing some work on defining precisely what MC/DC should mean in the context of Rust (with things like pattern-matching and or-patterns), but I don't have any of the details at hand. |
…errors coverage: Replace boolean options with a `CoverageLevel` enum After rust-lang#123409, and some discussion at rust-lang#79649 (comment) and rust-lang#124120, it became clear to me that we should have a unified concept of “coverage level”, instead of having several separate boolean flags that aren't actually independent. This PR therefore introduces a `CoverageLevel` enum, to replace the existing boolean flags for `branch` and `mcdc`. The `no-branch` value (for `-Zcoverage-options`) has been renamed to `block`, instructing the compiler to only instrument for block coverage, with no branch coverage or MD/DC instrumentation. `@rustbot` label +A-code-coverage cc `@ZhuUx` `@Lambdaris` `@RenjiSann`
I agree here, we shouldn't treat or patterns like short-circuiting boolean expressions. The way I'd see them is like two separate decisions to reach the same branch.
I think for branch coverage @Zalathar's approach is the best, while for MC/DC I'd go with @ZhuUx's second proposal.
We (Ferrous Systems) wrote that paper with DLR. It's still not published (we're submitting it to an academic conference), but if you need a copy of the pre-print for the implementation of branch or MC/DC coverage reach out at pietro.albini@ferrous-systems.com and I'll happily send you a copy! |
coverage: Replace boolean options with a `CoverageLevel` enum After #123409, and some discussion at rust-lang/rust#79649 (comment) and #124120, it became clear to me that we should have a unified concept of “coverage level”, instead of having several separate boolean flags that aren't actually independent. This PR therefore introduces a `CoverageLevel` enum, to replace the existing boolean flags for `branch` and `mcdc`. The `no-branch` value (for `-Zcoverage-options`) has been renamed to `block`, instructing the compiler to only instrument for block coverage, with no branch coverage or MD/DC instrumentation. `@rustbot` label +A-code-coverage cc `@ZhuUx` `@Lambdaris` `@RenjiSann`
@evodius96
If the condition id of |
For MC/DC, the order doesn't matter, but the order has to be consistent such that once a node gets an ID, it doesn't change. During llvm-cov visualization, where the conditions are counted according to their left-to-right ordinal position, a mapping is maintained to track a condition's ordinal position and the ID it has. In clang, the ID is assigned by the node visitor during a recursive AST walk (See The |
Ah Exactly that's what happened if the id of entry condition is not 1. Assertions fail at different locations between llvm 18 and 19. void buildTestVector(MCDCRecord::TestVector &TV, unsigned ID = 1) {
shouldCopyOffTestVectorForTruePath(TV, ID);
shouldCopyOffTestVectorForFalsePath(TV, ID);
// Reset back to DontCare.
TV[ID - 1] = MCDCRecord::MCDC_DontCare;
} It constructs all possible test vectors and is called first at As for llvm 19, llvm-cov would panic at Hence there is an implicit rule suggested by implementation that the entry condition of decision should be fixed with condition id 1. I think this is relatively fair so we (or any other compilers implementing mcdc based on llvm) would better respect it. At least it is not difficult for rust to find a way to regulate the condition id assignment of pattern matching. |
Similarly to what is being done in LLVM and Clang: https://reviews.llvm.org/D84467.
The text was updated successfully, but these errors were encountered: