Implement coverage reports for Starlark #15594

fmeum · 2022-05-30T13:30:09Z

If the new flag --starlark_coverage_report is set to a non-empty value, the Starlark coverage recorder will be enabled for all .bzl files under the output base executed during the Blaze command. At the end of the command, the collected coverage data will be emitted to the file path given as the flag argument in LCOV format, containing function, line, and branch coverage information.

fmeum · 2022-05-30T13:32:50Z

CC @Wyverald

I generated reports for some rulesets and found the results to be relatively accurate, but this would definitely benefit from more testing (it doesn't come with test cases yet) and polish. I'm also not sure whether it works for repository rules and/or functions executed during the execution phase, e.g. map_each callbacks. I wanted to open a PR to get early feedback and will happily fix bugs and supply tests if this is generally viewed favorably.

fmeum · 2022-05-30T14:04:40Z

src/main/java/net/starlark/java/eval/Eval.java

@@ -131,6 +133,7 @@ private static TokenKind execFor(StarlarkThread.Frame fr, ForStatement node)
            continue;
          case BREAK:
            // Finish loop, execute next statement after loop.
+            CoverageRecorder.getInstance().recordVirtualJump(node);


I don't have any data to assess whether this causes a performance degradation even if coverage is not being recorded (which means that the instance returned will have a no-op recordVirtualJump). If this should turn out to be prohibitive, I could make the instance static final and populate it e.g. via an environment variable rather than a Bazel flag.

Wyverald · 2022-05-30T14:09:45Z

Thanks for the PR, Fabian! I'll leave it to Jon & Sasha to review this. (NB: Jon is OOO until 9 June.)

Also cc @adonovan

fmeum · 2022-05-31T08:35:36Z

I fixed some bugs and refactored the branch logic, but will not make further changes.

sluongng

I definitely am not familiar with Google Java Style guide or the readability process, but here are some nits I have after reading through the PR. Please take them with a grain of salt 🙇‍♂️

Thanks for working on this 🙏 I can see how this would be very useful for my debugging process.

src/main/java/net/starlark/java/syntax/Parameter.java

src/main/java/net/starlark/java/syntax/CoverageRecorder.java

src/main/java/net/starlark/java/eval/Eval.java

src/main/java/net/starlark/java/syntax/NodePrinter.java

src/main/java/net/starlark/java/syntax/CoverageRecorder.java

sluongng · 2022-05-31T10:06:40Z

src/main/java/com/google/devtools/build/lib/runtime/CommonCommandOptions.java

+      defaultValue = "",
+      documentationCategory = OptionDocumentationCategory.LOGGING,
+      effectTags = {OptionEffectTag.BAZEL_MONITORING},
+      help = "Writes into the specified file an lcov coverage report for all Starlark files.")


From a discussion on Slack: It's not immediately clear to end-users that we are using this flag to report coverage of starlark during execution of a bazel command, similar to --starlark_cpu_profile.

My initial impression was that this, similar to other coverage flags, are used with bazel coverage when testing starlark rules with Skylib's unittest.bzl. This is NOT the case.

Might worth being a bit more specific in this description so that Bazel's end users are clear on intended usages.

I reworded this a little to hopefully clarify the scope. I would be interested in integrating a less low-level version of this with bazel coverage, but don't really know how to accomplish this yet.

adonovan

Thanks for sending this, but a change this big needs a design discussion before code. What problem are we trying to solve? What design do you propose? How can we make it simpler?

My initial questions are:

Does this make the interpreter slower? A lot of work has gone into making it as fast as it is (although there is still room for improvement). Benchmarking is time consuming because the JVM is noisy---execution time may vary by 5% or more---making it hard to find regressions down at the 1-2% level.
Do we need all this new machinery? In principle, a coverage tool could be implemented as a debugger that simply steps through the whole program one instruction at a time recording the current pc. We already have a primitive debugger. Can its hooks be used to record coverage, with minimal changes?
Do we need branch coverage? I notice the tool records control jumps such as break statements and the short-circuit in if true || true {, and information about individual parameters. Is this necessary? Statement coverage is good enough for most purposes and is less invasive.

adonovan · 2022-05-31T19:05:08Z

src/main/java/net/starlark/java/syntax/NodePrinter.java

@@ -14,6 +14,7 @@
 package net.starlark.java.syntax;

 import java.util.List;
+import net.starlark.java.syntax.Comprehension.Clause;


Please don't import nested classes; always refer to Comprehension.Clause outside that class.

src/main/java/net/starlark/java/syntax/NodePrinter.java

adonovan · 2022-05-31T19:06:53Z

src/main/java/net/starlark/java/syntax/NodeVisitor.java

@@ -174,9 +174,7 @@ public void visit(@SuppressWarnings("unused") Comment node) {}
  public void visit(ConditionalExpression node) {
    visit(node.getCondition());
    visit(node.getThenCase());
-    if (node.getElseCase() != null) {


Why is this change safe?

ConditionalExpression#getElseCase() is not annotated as @Nullable, which is correct since the lexer only accepts a ConditionalStatement if it has an else.

adonovan · 2022-05-31T19:12:24Z

src/main/java/net/starlark/java/syntax/CoverageVisitor.java

+import java.util.stream.Collectors;
+import javax.annotation.Nullable;
+
+abstract class CoverageVisitor extends NodeVisitor {


This class doesn't belong in the syntax package: it's a client of that package, and it seems like part of the interpreter (eval). It appears there is only a single actual class that implements this abstraction, so I suggest you merge them together.

I agree it doesn't belong here conceptually, but I didn't find a way to move it elsewhere:

It is actually extended twice, once by CoverageNodeVisitor and once anonymously when registering a Program. These two usages are also why I extracted the class in the first place: It is important that the coverage registration and the coverage report generation pass stay in sync.

All the new classes live in syntax since they have to be used in Program#compileFile. I searched for another place to hook into Program creation, but Program#compileFile is called directly from many places.

A `ConditionalExpression`, differing from an `IfStatement`, always has an `else` case.

The generated coverage report includes function, line, and branch coverage information in LCOV format.

If the new flag `--starlark_coverage_report` is set to a non-empty value, the Starlark coverage recorder will be enabled for all `.bzl` files under the output base executed during the Blaze command. At the end of the command, the collected coverage data will be emitted to the file path given as the flag argument in LCOV format.

fmeum · 2022-06-02T12:48:09Z

Thanks for sending this, but a change this big needs a design discussion before code. What problem are we trying to solve? What design do you propose? How can we make it simpler?

Sorry, I certainly didn't intend to skip proper process here. I wrote this code out of a concrete need to better understand the edge cases of rulesets I maintain and to improve my understanding of the Java Starlark implementation. I just thought about sharing it hoping that it could be useful for others. I am completely open to rewriting this from scratch or scrapping it entirely, depending on what the outcome of the design discussion is.

My initial questions are:

Does this make the interpreter slower? A lot of work has gone into making it as fast as it is (although there is still room for improvement). Benchmarking is time consuming because the JVM is noisy---execution time may vary by 5% or more---making it hard to find regressions down at the 1-2% level.

The way the hooks are currently implemented, I won't be able to dismiss this concern with confidence. As you said, benchmarking doesn't necessarily catch all regressions on such a low level.

I do think that there is a way to implement the coverage hooks without any overhead in the case where coverage isn't being collected:

Make the static CoverageRecorder instance final and populate it based on e.g. a system property starlark.lcov_report_file being set to a non-empty value.
Make --starlark_coverage_profile a startup option that sets this system property for the Bazel server JVM.
Add logic that dumps the report on server shutdown.

The static final field should be inlined by the JIT - if we want to be certain, we could also make it a simple boolean that triggers a coverage slow path if true. I could then double-check that the code emitted by the JIT doesn't change with the PR merged.

Do you like this approach better?

Do we need all this new machinery? In principle, a coverage tool could be implemented as a debugger that simply steps through the whole program one instruction at a time recording the current pc. We already have a primitive debugger. Can its hooks be used to record coverage, with minimal changes?

After a first read of the code, I'm inclined to say this would work for statement coverage, but not for more fine-grained notions of coverage.

Do we need branch coverage? I notice the tool records control jumps such as break statements and the short-circuit in if true || true {, and information about individual parameters. Is this necessary? Statement coverage is good enough for most purposes and is less invasive.

I agree that statement coverage would already be quite useful. However, given the general conciseness of Starlark (especially with list/dict comprehensions and conditional expressions), a single statement can encode rather complex logic. With the rulesets I have tried this on, statement coverage has almost always been nearing 100%, but branch coverage revealed some interesting uncovered cases (e.g., handling of workspace vs. external repository, checks for missing/superfluous files). While branch coverage information for breaks may not be as useful, I found it better to cover all branches rather than just some of them.

Assuming we find a way to ensure that coverage support introduces no overhead at all (as possibly described above), would you still prefer to remove some or all of the branch coverage hooks?.

fmeum · 2022-07-17T16:25:29Z

@adonovan Would you like me to prototype the static final approach and take a look at the JIT output? I think that this has a fair chance of alleviating the performance worries entirely.

brandjon · 2022-12-19T18:17:47Z

Finally taking a look at this, as I promised Fabian back at BazelCon.

It looks like there's quite a bit more code to this PR than I originally thought. I'm interested in supporting more tooling for Starlark in the new year, coverage included, but in light of

the added code complexity (who will maintain all this extra logic?),
concerns of performance regression, and
uncertainty about the extent of the benefit of this feature to users, and the relative benefits of expression granularity vs statement granularity,

it's clear that a quick acceptance of this PR is off the table. I'd love to have a more involved discussion centered around the benefit vs maintainability tradeoff in Q1. I'll keep the PR open for now.

fmeum · 2024-03-05T12:48:34Z

@brandjon @tetromino Friendly ping on this.

I could trim this down to just line coverage and ensure that there is no runtime overhead if coverage is disabled.

fmeum marked this pull request as ready for review May 30, 2022 13:36

fmeum requested review from brandjon and tetromino as code owners May 30, 2022 13:36

fmeum commented May 30, 2022

View reviewed changes

sgowroji added team-Starlark-Integration Issues involving Bazel's integration with Starlark, excluding builtin symbols awaiting-review PR is awaiting review from an assigned reviewer labels May 31, 2022

fmeum force-pushed the starlark-coverage branch 2 times, most recently from 24f1b2f to bce3f0d Compare May 31, 2022 07:09

fmeum marked this pull request as draft May 31, 2022 08:03

fmeum marked this pull request as ready for review May 31, 2022 08:34

fmeum force-pushed the starlark-coverage branch from bce3f0d to 30765f5 Compare May 31, 2022 08:34

Fix NPE in Parameter.Star#getLocation() for unnamed vararg

edc799e

sluongng reviewed May 31, 2022

View reviewed changes

fmeum force-pushed the starlark-coverage branch 4 times, most recently from 4e8e8ed to 0d2dcca Compare May 31, 2022 10:24

adonovan reviewed May 31, 2022

View reviewed changes

fmeum added 4 commits June 1, 2022 15:51

Fix NodePrinter crash on Comprehension.Clause

4d1fe9b

Remove redundant null check in NodeVisitor

fc580d7

A `ConditionalExpression`, differing from an `IfStatement`, always has an `else` case.

Add an LCOV coverage report generator for Starlark

90607e1

The generated coverage report includes function, line, and branch coverage information in LCOV format.

fmeum force-pushed the starlark-coverage branch from 0d2dcca to b603de0 Compare June 2, 2022 08:30

fmeum mentioned this pull request Aug 25, 2022

Optimise repository rule generation with re-entrancy #16162

Closed

brandjon added team-Starlark-Interpreter Issues involving the Starlark interpreter used by Bazel and removed team-Starlark-Integration Issues involving Bazel's integration with Starlark, excluding builtin symbols awaiting-review PR is awaiting review from an assigned reviewer labels Dec 19, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement coverage reports for Starlark #15594

Implement coverage reports for Starlark #15594

fmeum commented May 30, 2022 •

edited

Loading

fmeum commented May 30, 2022 •

edited

Loading

fmeum May 30, 2022

Wyverald commented May 30, 2022

fmeum commented May 31, 2022 •

edited

Loading

sluongng left a comment

sluongng May 31, 2022

fmeum May 31, 2022

adonovan left a comment •

edited

Loading

adonovan May 31, 2022

fmeum Jun 2, 2022

adonovan May 31, 2022

fmeum Jun 2, 2022

adonovan May 31, 2022

fmeum Jun 2, 2022

fmeum commented Jun 2, 2022

fmeum commented Jul 17, 2022

brandjon commented Dec 19, 2022

fmeum commented Mar 5, 2024

Implement coverage reports for Starlark #15594

Are you sure you want to change the base?

Implement coverage reports for Starlark #15594

Conversation

fmeum commented May 30, 2022 • edited Loading

fmeum commented May 30, 2022 • edited Loading

Choose a reason for hiding this comment

Wyverald commented May 30, 2022

fmeum commented May 31, 2022 • edited Loading

sluongng left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

adonovan left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

fmeum commented Jun 2, 2022

fmeum commented Jul 17, 2022

brandjon commented Dec 19, 2022

fmeum commented Mar 5, 2024

fmeum commented May 30, 2022 •

edited

Loading

fmeum commented May 30, 2022 •

edited

Loading

fmeum commented May 31, 2022 •

edited

Loading

adonovan left a comment •

edited

Loading