Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Line coverage reports #2609

Merged
merged 94 commits into from
Jul 28, 2023
Merged

Conversation

adpaco-aws
Copy link
Contributor

@adpaco-aws adpaco-aws commented Jul 19, 2023

Description of changes:

Adds a coverage report feature to Kani that does not require cbmc-viewer.

In particular, it introduces the unstable --coverage verification option to kani-driver (only usable with the -Z line-coverage unstable feature), which in turn enables the --coverage-checks option in kani-compiler. This results in the injection of coverage checks (similar to cover statements, but with a fixed condition) before each Rust MIR statement and terminator in a basic block. The post-processing of coverage checks allows us to report coverage information at the line level.

This contribution also includes a new compiletest suite developed by @jaisnan for end-to-end testing of the emitted coverage information for different test cases.

The command is:

kani file.rs --coverage -Z line-coverage

For instance, for the assume_false.rs example

#[kani::proof]
fn wrong_coverage_2() {
    let a: u8 = kani::any();
    kani::assume(false);
    assert!(a < 5);
}

we obtain the verification results:

RESULTS:
Check 1: wrong_coverage_2.assertion.1
         - Status: UNREACHABLE
         - Description: "assertion failed: a < 5"
         - Location: assume_false.rs:5:5 in function wrong_coverage_2

and the following coverage results:

Coverage Results:
/home/ubuntu/kani/library/kani/src/lib.rs, 133, FULL
/home/ubuntu/kani/library/kani/src/lib.rs, 134, FULL
/home/ubuntu/kani/library/kani/src/lib.rs, 176, FULL
/home/ubuntu/kani/library/kani/src/lib.rs, 177, FULL

/home/ubuntu/kani/assume_false.rs, 3, FULL
/home/ubuntu/kani/assume_false.rs, 4, FULL
/home/ubuntu/kani/assume_false.rs, 5, NONE
/home/ubuntu/kani/assume_false.rs, 6, NONE

/home/ubuntu/kani/library/kani/src/arbitrary.rs, 55, FULL

Resolved issues:

Resolves #2585
Resolves #2608
Resolves #2610

Related RFC:

#2612

Call-outs:

Testing:

  • How is this change tested? Existing tests + new coverage suite.

  • Is this a refactor change? No.

Checklist

  • Each commit message has a non-empty body, explaining why the change was made
  • Methods or procedures are documented
  • Regression or unit tests are included, or existing tests cover the modified code
  • My PR is restricted to a single feature or bugfix

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

jaisnan and others added 30 commits July 13, 2023 18:22
Copy link
Contributor

@celinval celinval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to polish the coverage output to make it more intuitive, but this is a good start.

Please create issues to:

  1. Improve the tests
  2. Create a user friendly output
  3. Export results to a well-supported output
  4. Investigate if we can make some results more intuitive (maybe compare against Rust code coverage results).
  5. All other open questions in the RFC

kani-driver/src/cbmc_output_parser.rs Outdated Show resolved Hide resolved
kani-driver/src/cbmc_property_renderer.rs Outdated Show resolved Hide resolved
kani-driver/src/cbmc_property_renderer.rs Outdated Show resolved Hide resolved
kani-driver/src/cbmc_property_renderer.rs Outdated Show resolved Hide resolved
tests/coverage/reachable/assert/reachable_pass/test.rs Outdated Show resolved Hide resolved
tests/coverage/reachable/bounds/reachable_fail/test.rs Outdated Show resolved Hide resolved
tools/compiletest/src/common.rs Show resolved Hide resolved
Copy link
Contributor

@zhassan-aws zhassan-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few redundant tests remaining. Otherwise, looks good.

tests/coverage/unreachable/vectors/main.rs Show resolved Hide resolved
tests/coverage/unreachable/variant/main.rs Show resolved Hide resolved
tests/coverage/unreachable/turned-off/test.rs Outdated Show resolved Hide resolved
tests/coverage/unreachable/rem-zero/test.rs Outdated Show resolved Hide resolved
tests/coverage/unreachable/overflow/test.rs Outdated Show resolved Hide resolved
tests/coverage/unreachable/overflow-neg/test.rs Outdated Show resolved Hide resolved
tests/coverage/unreachable/div-zero/test.rs Outdated Show resolved Hide resolved
tests/coverage/unreachable/debug-assert-ne/test.rs Outdated Show resolved Hide resolved
@adpaco-aws
Copy link
Contributor Author

Created the following issues:

which I'm hoping take care of the remaining work here and in the RFC (#2618)

There are a few redundant tests remaining

Indeed, there were many tests that were redundant or didn't have a clear purpose. I removed a bunch of them and have #2635 for another pass.

@adpaco-aws adpaco-aws merged commit db42ee9 into model-checking:main Jul 28, 2023
@celinval
Copy link
Contributor

@adpaco-aws should we create a milestone for those?

@adpaco-aws
Copy link
Contributor Author

Yes, just created a Coverage reports milestone to keep track of all of these. I create a new one instead of using the Source-based coverage report generation milestone because that one contains issues with the visualize feature, and it's not clear to me that we should mix both.

github-merge-queue bot pushed a commit that referenced this pull request Aug 27, 2024
This PR replaces the line-based coverage instrumentation we introduced
in #2609 with the standard source-based code coverage instrumentation
performed by the Rust compiler.

As a result, we now insert code coverage checks in the
`StatementKind::Coverage(..)` statements produced by the Rust compiler
during compilation. These checks include coverage-relevant
information[^note-internal] such as the coverage counter/expression they
represent [^note-instrument]. Both the coverage metadata (`kanimap`) and
coverage results (`kaniraw`) are saved into files after the verification
stage.

Unfortunately, we currently have a chicken-egg problem with this PR and
#3121, where we introduce a tool named `kani-cov` to postprocess
coverage results. As explained in #3143, `kani-cov` is expected to be an
alias for the `cov` subcommand and provide most of the postprocessing
features for coverage-related purposes. But, the tool will likely be
introduced after this change. Therefore, we propose to temporarily print
a list of the regions in each function with their associated coverage
status (i.e., `COVERED` or `UNCOVERED`).

### Source-based code coverage: An example

The main advantage of source-based coverage results is their precision
with respect to the source code. The [Source-based Code
Coverage](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html)
documentation explains more details about the LLVM coverage workflow and
its different options.

For example, let's take this Rust code:
```rust
1 fn _other_function() {
2    println!("Hello, world!");
3 }
4
5 fn test_cov(val: u32) -> bool {
6     if val < 3 || val == 42 {
7         true
8     } else {
9         false
10    }
11 }
12
13 #[cfg_attr(kani, kani::proof)]
14 fn main() {
15    let test1 = test_cov(1);
16    let test2 = test_cov(2);
17    assert!(test1);
18    assert!(test2);
19 }
```

Compiling and running the program with `rustc` and the `-C
instrument-coverage` flag, and using the LLVM tools can get us the
following coverage result:


![Image](https://github.com/model-checking/kani/assets/73246657/9070e390-6e0b-4add-828d-d9f9caacad07)


In contrast, the `cargo kani --coverage -Zsource-coverage` command
currently generates:

```
src/main.rs (main)
 * 14:1 - 19:2 COVERED

src/main.rs (test_cov)
 * 5:1 - 6:15 COVERED
 * 6:19 - 6:28 UNCOVERED
 * 7:9 - 7:13 COVERED
 * 9:9 - 9:14 UNCOVERED
 * 11:1 - 11:2 COVERED
```

which is a verification-based coverage result almost equivalent to the
runtime coverage results.

### Benchmarking

We have evaluated the performance impact of the instrumentation using
the `kani-perf.sh` suite (14 benchmarks). For each test, we compare the
average time to run standard verification against the average time to
run verification with the source-based code coverage feature
enabled[^note-line-evaluation].

The evaluation has been performed on an EC2 `m5a.4xlarge` instance
running Ubuntu 22.04. The experimental data has been obtained by running
the `kani-perf.sh` script 10 times for each version (`only verification`
and `verification + coverage`), computing the average and standard
deviation. We've split this data into `small` (tests taking 60s or less)
and `large` (tests taking more than 60s) and drawn the two graphs below.

#### Performance comparison - `small` benchmarks


![performance_comparison_small](https://github.com/user-attachments/assets/679cf412-0193-4b0c-a78c-2d0fb702706f)

#### Performance comparison - `large` benchmarks


![performance_comparison_large](https://github.com/user-attachments/assets/4bb5a895-7f57-49e0-86b5-5fea67fad939)

#### Comments on performance

Looking at the small tests, the performance impact seems negligible in
such cases. The difference is more noticeable in the large tests, where
the time to run verification and coverage can take 2x or even more. It
wouldn't be surprising that, as programs become larger, the complexity
of the coverage checking grows exponentially as well. However, since
most verification jobs don't take longer than 30min (1800s), it's OK to
say that coverage checking represents a 100-200% slowdown in the worst
case w.r.t. standard verification.

It's also worth noting a few other things:
* The standard deviation remains similar in most cases, meaning that the
coverage feature doesn't have an impact on their stability.
* We haven't tried any SAT solvers other than the ones used by default
for each benchmark. It's possible that other solvers perform
better/worse with the coverage feature enabled.

### Call-outs
 * The soundness issue documented in #3441.
* The issue with saving coverage mappings for non-reachable functions
documented in #3445.
* I've modified the test cases in `tests/coverage/` to test this
feature. Since this technique is simpler, we don't need that many test
cases. However, it's possible I've left some test cases which don't
contribute much. Please let me know if you want to add/remove a test
case.

[^note-internal]: The coverage mappings can't be accessed through the
StableMIR interface so we retrieve them through the internal API.

[^note-instrument]: The instrumentation replaces certain counters with
expressions based on other counters when possible to avoid a part of the
runtime overhead. More details can be found
[here](https://github.com/rust-lang/rustc-dev-guide/blob/master/src/llvm-coverage-instrumentation.md#mir-pass-instrumentcoverage).
Unfortunately, we can't avoid instrumenting expressions at the moment.

[^note-line-evaluation]: We have not compared performance against the
line-based code coverage feature because it doesn't seem worth it. The
line-based coverage feature is guaranteed to include more coverage
checks than the source-based one for any function. In addition,
source-based results are more precise than line-based ones. So this
change represents both a quantitative and qualitative improvement.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 and MIT licenses.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Status: No status
Status: Done
Development

Successfully merging this pull request may close these issues.

Coverage option that does not require cbmc-viewer Add test coverage for coverage information
5 participants