MIR InstrumentCoverage - Can spans for `TerminatorKind::Goto` be improved to avoid special cases?
#78542
Labels: A-MIR (Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html)
Rust's LLVM InstrProf-based source code coverage implementation instruments Rust code via the MIR pass `InstrumentCoverage`. Most criteria for identifying coverage regions and counter locations are very general, based on Control Flow Graph (CFG) analysis of the MIR, and a fairly straightforward mapping of MIR `Statement`s and `Terminator`s to their source code regions (`Span`s). `TerminatorKind::Goto`s are an exception, requiring special handling.

This issue is created to highlight some of the unique requirements and issues addressed in the current coverage implementation, in case someone has ideas for reducing the reliance on the `Goto`-specific logic, either by improving `InstrumentCoverage` (if something was overlooked) or by improving the `Goto` representation (such as refining its `Span` representation, or providing additional context that `InstrumentCoverage` might leverage).

Current State
One of the first steps in the `InstrumentCoverage` process is to extract relevant code `Span`s from the MIR `Statement`s and `Terminator`s. (These `Span`s are later combined into sets of sequential statements with contiguous source code regions that can be counted via a single counter; i.e., if any statement in the set was executed, all statements in the same set would also have been executed.)

`bcb_to_initial_coverage_spans()`
iterates through the `BasicBlock`s of the `CoverageGraph` (a subset of the MIR, essentially skipping panic/unwind paths), and their `Statement`s and `Terminator`s. Some `Statement`s and `Terminator`s are relevant to `Coverage`, and others are not. The `Statement` and `Terminator` filtering is handled by `filtered_statement_span()` and `filtered_terminator_span()`, respectively.

In almost all cases, if not filtered out, the initial coverage
`Span` contributed by either a `Statement` or a `Terminator` is the `source_info.span` (within the function body) of the `Statement` or `Terminator`, because, in most cases, the source code span carried forward from the parsed source to its MIR representation is a fairly accurate mapping from intent to execution.

For example,
`filtered_terminator_span()` uses the entire `source_info.span` for the following `TerminatorKind`s:

All other `TerminatorKind`s are filtered out, except for `Goto`.

`Goto` terminators play an important role in the control flow, so they cannot be filtered out, but a `Goto`'s `source_info.span` typically includes the `Span`s of the statements that precede it, making the `Span` redundant in most cases.

Since a
`Goto`-based `CoverageSpan` still needs a span to indicate whether a region of actual source code was executed, the span returned from `filtered_terminator_span()` for `Goto`s is an empty span, positioned at the `Goto` span's last byte position:

This byte position can, most often, be leveraged to contribute to a `CoverageSpan` for certain execution branches.

For example, an
`if` block without an `else` shows the block was executed if the condition was `true`, but there would be no way to indicate coverage (or lack thereof) of the `false` branch without using the associated `Goto`'s `hi()` byte position (which is expanded by one character to the left, for a non-empty `CoverageSpan`).

However, in other cases, a visible
`CoverageSpan` is not wanted, but the `Goto` block must still be counted (for example, to contribute its count to an `Expression` that reports the execution count for some other block). In these cases, the code region is set to `None`.

This decision (whether to include a one-character coverage span for a
`Goto` or to count a `Goto` block without a code region) is handled in `inject_coverage_span_counters()`, beginning with the call to `is_code_region_redundant()`, which encapsulates the decision on how to handle these special cases.

At the time of this writing, the decision logic only looks for
`Goto` terminators with spans that end at the last byte position in the file, because these `Goto` spans (if present) are redundant with the spans from every function's final `Return` terminator. When they are present, they can cause the function's last line to appear to have been executed twice, when it was only executed once.
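
The `Goto` handling described above can be sketched roughly as follows. This is a simplified illustration, not the actual `InstrumentCoverage` code: `ByteSpan`, `goto_code_region()`, and the `file_end` parameter are hypothetical stand-ins introduced here, and the real pass works with rustc's own `Span` type and `CoverageSpan` machinery. The sketch only shows the three behaviors the text names: shrinking a `Goto` span to an empty span at its last byte position, widening it by one character to the left when a visible region is wanted, and returning `None` when the span ends at the last byte position in the file.

```rust
/// Hypothetical stand-in for a source span: the half-open byte range `lo..hi`.
/// (rustc's real `Span` carries more information than this.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ByteSpan {
    pub lo: u32,
    pub hi: u32,
}

impl ByteSpan {
    /// Returns an empty span positioned at this span's end (`hi`).
    pub fn shrink_to_hi(self) -> ByteSpan {
        ByteSpan { lo: self.hi, hi: self.hi }
    }

    /// True when the span covers no bytes.
    pub fn is_empty(self) -> bool {
        self.lo == self.hi
    }
}

/// Decides which code region, if any, a counted `Goto` block should report.
///
/// `file_end` is assumed to be the last byte position in the source file.
/// Returning `None` means "count the block, but show no code region".
pub fn goto_code_region(goto_span: ByteSpan, file_end: u32) -> Option<ByteSpan> {
    // Start from the empty span at the `Goto` span's last byte position.
    let marker = goto_span.shrink_to_hi();

    // A `Goto` span ending at the end of the file is treated as redundant
    // with the function's final `Return`, so no region is reported.
    if marker.hi == file_end {
        return None;
    }

    // Otherwise widen the marker by one byte to the left (clamped at 0)
    // so that the reported region is not empty.
    Some(ByteSpan { lo: marker.hi.saturating_sub(1), hi: marker.hi })
}
```

For instance, under these assumptions a `Goto` whose span is `10..25` in a 100-byte file would report the one-byte region `24..25`, while a `Goto` whose span ends at byte 100 would be counted without any region.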