-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement coverage reports for Starlark #15594
base: master
Are you sure you want to change the base?
Conversation
CC @Wyverald I generated reports for some rulesets and found the results to be relatively accurate, but this would definitely benefit from more testing (it doesn't come with test cases yet) and polish. I'm also not sure whether it works for repository rules and/or functions executed during the execution phase, e.g. map_each callbacks. I wanted to open a PR to get early feedback and will happily fix bugs and supply tests if this is generally viewed favorably. |
@@ -131,6 +133,7 @@ private static TokenKind execFor(StarlarkThread.Frame fr, ForStatement node) | |||
continue; | |||
case BREAK: | |||
// Finish loop, execute next statement after loop. | |||
CoverageRecorder.getInstance().recordVirtualJump(node); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any data to assess whether this causes a performance degradation even if coverage is not being recorded (which means that the instance returned will have a no-op recordVirtualJump
). If this should turn out to be prohibitive, I could make the instance static final
and populate it e.g. via an environment variable rather than a Bazel flag.
Thanks for the PR, Fabian! I'll leave it to Jon & Sasha to review this. (NB: Jon is OOO until 9 June.) Also cc @adonovan |
24f1b2f
to
bce3f0d
Compare
I fixed some bugs and refactored the branch logic, but will not make further changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely am not familiar with Google Java Style guide or the readability process, but here are some nits I have after reading through the PR. Please take them with a grain of salt 🙇♂️
Thanks for working on this 🙏 I can see how this would be very useful for my debugging process.
defaultValue = "", | ||
documentationCategory = OptionDocumentationCategory.LOGGING, | ||
effectTags = {OptionEffectTag.BAZEL_MONITORING}, | ||
help = "Writes into the specified file an lcov coverage report for all Starlark files.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a discussion on Slack: It's not immediately clear to end-users that we are using this flag to report coverage of starlark during execution of a bazel command, similar to --starlark_cpu_profile
.
My initial impression was that this, similar to other coverage flags, are used with bazel coverage
when testing starlark rules with Skylib's unittest.bzl
. This is NOT the case.
Might worth being a bit more specific in this description so that Bazel's end users are clear on intended usages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reworded this a little to hopefully clarify the scope. I would be interested in integrating a less low-level version of this with bazel coverage
, but don't really know how to accomplish this yet.
4e8e8ed
to
0d2dcca
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for sending this, but a change this big needs a design discussion before code. What problem are we trying to solve? What design do you propose? How can we make it simpler?
My initial questions are:
- Does this make the interpreter slower? A lot of work has gone into making it as fast as it is (although there is still room for improvement). Benchmarking is time consuming because the JVM is noisy---execution time may vary by 5% or more---making it hard to find regressions down at the 1-2% level.
- Do we need all this new machinery? In principle, a coverage tool could be implemented as a debugger that simply steps through the whole program one instruction at a time recording the current pc. We already have a primitive debugger. Can its hooks be used to record coverage, with minimal changes?
- Do we need branch coverage? I notice the tool records control jumps such as break statements and the short-circuit in
if true || true {
, and information about individual parameters. Is this necessary? Statement coverage is good enough for most purposes and is less invasive.
@@ -14,6 +14,7 @@ | |||
package net.starlark.java.syntax; | |||
|
|||
import java.util.List; | |||
import net.starlark.java.syntax.Comprehension.Clause; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't import nested classes; always refer to Comprehension.Clause outside that class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
@@ -174,9 +174,7 @@ public void visit(@SuppressWarnings("unused") Comment node) {} | |||
public void visit(ConditionalExpression node) { | |||
visit(node.getCondition()); | |||
visit(node.getThenCase()); | |||
if (node.getElseCase() != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this change safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ConditionalExpression#getElseCase()
is not annotated as @Nullable
, which is correct since the lexer only accepts a ConditionalStatement
if it has an else
.
import java.util.stream.Collectors; | ||
import javax.annotation.Nullable; | ||
|
||
abstract class CoverageVisitor extends NodeVisitor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class doesn't belong in the syntax package: it's a client of that package, and it seems like part of the interpreter (eval). It appears there is only a single actual class that implements this abstraction, so I suggest you merge them together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it doesn't belong here conceptually, but I didn't find a way to move it elsewhere:
- It is actually extended twice, once by
CoverageNodeVisitor
and once anonymously when registering aProgram
. These two usages are also why I extracted the class in the first place: It is important that the coverage registration and the coverage report generation pass stay in sync. - All the new classes live in
syntax
since they have to be used inProgram#compileFile
. I searched for another place to hook intoProgram
creation, butProgram#compileFile
is called directly from many places.
A `ConditionalExpression`, differing from an `IfStatement`, always has an `else` case.
The generated coverage report includes function, line, and branch coverage information in LCOV format.
If the new flag `--starlark_coverage_report` is set to a non-empty value, the Starlark coverage recorder will be enabled for all `.bzl` files under the output base executed during the Blaze command. At the end of the command, the collected coverage data will be emitted to the file path given as the flag argument in LCOV format.
Sorry, I certainly didn't intend to skip proper process here. I wrote this code out of a concrete need to better understand the edge cases of rulesets I maintain and to improve my understanding of the Java Starlark implementation. I just thought about sharing it hoping that it could be useful for others. I am completely open to rewriting this from scratch or scrapping it entirely, depending on what the outcome of the design discussion is.
The way the hooks are currently implemented, I won't be able to dismiss this concern with confidence. As you said, benchmarking doesn't necessarily catch all regressions on such a low level. I do think that there is a way to implement the coverage hooks without any overhead in the case where coverage isn't being collected:
The Do you like this approach better?
After a first read of the code, I'm inclined to say this would work for statement coverage, but not for more fine-grained notions of coverage.
I agree that statement coverage would already be quite useful. However, given the general conciseness of Starlark (especially with list/dict comprehensions and conditional expressions), a single statement can encode rather complex logic. With the rulesets I have tried this on, statement coverage has almost always been nearing 100%, but branch coverage revealed some interesting uncovered cases (e.g., handling of workspace vs. external repository, checks for missing/superfluous files). While branch coverage information for Assuming we find a way to ensure that coverage support introduces no overhead at all (as possibly described above), would you still prefer to remove some or all of the branch coverage hooks?. |
@adonovan Would you like me to prototype the |
Finally taking a look at this, as I promised Fabian back at BazelCon. It looks like there's quite a bit more code to this PR than I originally thought. I'm interested in supporting more tooling for Starlark in the new year, coverage included, but in light of
it's clear that a quick acceptance of this PR is off the table. I'd love to have a more involved discussion centered around the benefit vs maintainability tradeoff in Q1. I'll keep the PR open for now. |
@brandjon @tetromino Friendly ping on this. I could trim this down to just line coverage and ensure that there is no runtime overhead if coverage is disabled. |
If the new flag
--starlark_coverage_report
is set to a non-empty value, the Starlark coverage recorder will be enabled for all.bzl
files under the output base executed during the Blaze command. At the end of the command, the collected coverage data will be emitted to the file path given as the flag argument in LCOV format, containing function, line, and branch coverage information.