Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lcov reports should not use line number zero in BRDA: records #1846

Closed
zackw opened this issue Sep 4, 2024 · 16 comments · Fixed by #1849
Closed

lcov reports should not use line number zero in BRDA: records #1846

zackw opened this issue Sep 4, 2024 · 16 comments · Fixed by #1849
Labels
bug Something isn't working

Comments

@zackw
Copy link
Contributor

zackw commented Sep 4, 2024

Describe the bug

coverage lcov can generate BRDA: records with line number zero. Current versions of lcov's genhtml reject such lines:

genhtml: ERROR: line 0 of .../sourcefile.py has branchcov but no linecov data

To Reproduce

The bad BRDA records are coming from here:

# The exit branches have a negative line number,
# this will not produce valid lcov. Setting
# the line number of the exit branch to 0 will allow
# for valid lcov, while preserving the data.
line_number = max(line_number, 0)

I do not understand how to hit this case. The commentary, and the description of what negative numbers mean in arc records (https://coverage.readthedocs.io/en/7.6.1/api_coveragedata.html#coverage.CoverageData.arc) make it sound like unexecuted conditional return statements could cause the problem. Running coverage run --branch test.py on this file

def f(x):
  if x < 0:
    return
  print("got here")

f(0)
f(1)

produces a .coverage database whose arc table does contain negative numbers, but running coverage lcov on that database produces

BRDA:3,0,0,-
BRDA:4,0,1,1
BRF:2
BRH:1

which does not use line zero and looks correct to me (well, modulo the fact that a human would say there isn't a branch on line 4? but that's a separate issue).

Looking at the database for the real program that produced the actual bad lcov-format report that prompted this issue, I see lots more cases of negative numbers in the arc table, but the associated positive numbers are too large to be actual line numbers in the associated file, which means I don't actually understand what the records in the arc table mean, so I'm stuck.

Expected behavior

coverage lcov should generate a .lcov file that genhtml does not reject.

@zackw zackw added the bug Something isn't working label Sep 4, 2024
@nedbat
Copy link
Owner

nedbat commented Sep 4, 2024

Looking at the database for the real program that prompted me to file this issue, I see lots more cases of negative numbers in the arc table, but the associated positive numbers are too large to be actual line numbers in the associated file, which means I don't actually understand what the records in the arc table mean, so I'm stuck.

Can you share that project?

@zackw
Copy link
Contributor Author

zackw commented Sep 4, 2024

Sure: https://github.com/MillionConcepts/pdr

Using that, a repro recipe should be

  • create a blank venv
  • pip install pytest-cov
  • pip install -e '.[browsify,fits,pvl,tiff]'
  • pytest --cov --cov-branch --import-mode=importlib
  • coverage lcov

@zackw
Copy link
Contributor Author

zackw commented Sep 4, 2024

If I understand correctly what BRDA: records mean (which I am about 90% sure of), the line number is supposed to be the line where the branch actually is, and the block and branch numbers are supposed to be, together, a unique but meaningless identifier for the branch's destination. Based on this, arc records with negative fromno shouldn't generate BRDA: records at all, and you can just make up block and branch numbers for all the different tonos for any given positive fromno (it seems to be fine to use zero in those fields).

Also, - in the taken field is supposed to happen only when that branch was never reached, i.e. the line containing the branch wasn't executed at all. It should be zero for branches where the conditional statement did execute but the branch was not taken.

@nedbat
Copy link
Owner

nedbat commented Sep 5, 2024

Thanks. Can you point me to specific entries in the .coverage file that have nonsensical entries in the arcs table? BTW, I'm using Python 3.11. Your setup.py says 3.9 should work, but it fails when trying to import inspect.get_annotations.

@zackw
Copy link
Contributor Author

zackw commented Sep 5, 2024 via email

@nedbat
Copy link
Owner

nedbat commented Sep 5, 2024

Sorry, you said:

Looking at the database for the real program that produced the actual bad lcov-format report that prompted this issue, I see lots more cases of negative numbers in the arc table, but the associated positive numbers are too large to be actual line numbers in the associated file...

I'm curious what entries in the arc table seem wrong to you?

@zackw
Copy link
Contributor Author

zackw commented Sep 6, 2024

Ah, I misunderstood what you meant by "nonsensical entries in the arcs table".

I'm attaching a SQL dump of the coverage database that I'm looking at. I have added a max_lineno column to the file table, giving the maximum line number (as measured by wc -l) for each file. Using that, here are the arc-table entries that seem to violate the specification:

sqlite> select file.path, file.max_lineno, arc.fromno, arc.tono
          from arc, file
         where arc.file_id = file.id
           and (arc.fromno > file.max_lineno
             or arc.tono > file.max_lineno);
path                     max_lineno  fromno  tono
-----------------------  ----------  ------  ----
pdr/pdr.py               968         964     971
pdr/pdr.py               968         967     969
pdr/pdr.py               968         969     -964
pdr/pdr.py               968         971     -959
pdr/loaders/dispatch.py  137         130     138
pdr/loaders/dispatch.py  137         138     -1

I don't know if this has anything to do with the BRDA:0 issue.

It is also not clear to me what negative numbers other than -1 mean.

This database was generated from git commit 092675e442438ddc39d4bb9e9d0d5f4c1e317f29. The .coveragerc was slightly modified from what's checked in:

[run]
source_pkgs = pdr
omit =
	*/six_new.py
	*/tkinter_stub.py

coverage+linemax.sql.gz

@zackw
Copy link
Contributor Author

zackw commented Sep 6, 2024

Another piece of information that might be useful to you: here is a comparison between the output of coverage lcov (A.lcov in the diff) and the result of running the xml2lcov utility (from the lcov project) on the output of coverage xml (B.lcov in the diff), after normalizing both files for comparison.

"Normalizing both files for comparison" means:

  • all TN: lines were stripped
  • checksums were removed from all DA: lines
  • records were sorted by SF: pathname
  • within each record, the sequence of sub-record types was canonicalized to DA, LF, LH, BRDA, BRF, BRH
  • DA and BRDA lines were sorted numerically by line number
  • vacuous BRF:0 BRH:0 and LF:0 LH:0 pairs were removed

Abstractly, every difference shown in this attachment represents a bug in one of the three tools used, but I do not know which one. For purpose of this bug report, however, please note that the BRDA records are consistently totally different. There are occasional differences in DA records as well.

lcov-gen-comparison.diff.gz

@zackw
Copy link
Contributor Author

zackw commented Sep 6, 2024

A couple of times earlier I said "GCDA" when I meant "BRDA". Sorry. No idea where GC came from.

@nedbat
Copy link
Owner

nedbat commented Sep 6, 2024

I'm coming to the unfortunate conclusion that the coverage.py lcov output is simply wrong with respect to branches. It produces BRDA lines that name the destination line, but should name the source line. The lcov format is poorly documented. This seems to be the best we have: https://manpages.debian.org/stretch/lcov/geninfo.1.en.html . It says:

Branch coverage information is stored which one line per branch:

BRDA:<line number>,<block number>,<branch number>,<taken>

Block number and branch number are gcc internal IDs for the branch. Taken is either '-' if the basic block containing the branch was never executed or a number indicating how often that branch was taken.

"which" is supposed to be "with" I guess? And line number isn't clarified. The xml2lcov tool reports the source line number, which makes sense. Changing to that interpretation will solve the 0 line number problem.

@bradb423 What do you think?

zackw added a commit to MillionConcepts/coveragepy-fixes that referenced this issue Sep 6, 2024
…rts.

This fixes five serious bugs:

- The first field of a BRDA: line may not be zero (nedbat#1846)
- The first field of a BRDA: line is supposed to be the *source* line of a
  branch, not the destination line.
- The fourth field of a BRDA: line is supposed to be “-” when the branch
  was *never reached*, not when it was reached but never taken (which is
  what a branch’s presence in missing_arcs means).  As far as I can tell,
  coverage.py currently doesn’t know of the existence of branches that were
  never reached.

- The decision of whether to emit a DA: line at all is now taken according to
  what’s in analysis.statements, not what’s in analysis.executed and
  analysis.missing; this is important because some lines (e.g. the beginnings
  of docstrings) may appear in analysis.executed but *not* in
  analysis.statements.

- We no longer attempt to call branch-coverage-related Analysis methods
  when analysis.has_arcs is false.

And two minor annoyances:

- DA: and BRDA: lines are now emitted strictly in ascending order by (source)
  line number.
- Source file records are now sorted by *relative* pathname, not absolute
  pathname from the coverage database.
zackw added a commit to MillionConcepts/coveragepy-fixes that referenced this issue Sep 9, 2024
…rts.

This fixes five serious bugs:

- The first field of a BRDA: line may not be zero (nedbat#1846).
- The first field of a BRDA: line is supposed to be the *source* line of each
  instrumented branch, not the destination line.
- The fourth field of a BRDA: line is supposed to be “-” when the branch
  was *never reached*, not when it was reached but never/always taken (which
  is what a branch’s presence in missing_arcs means).  As far as I can tell,
  coverage.py currently doesn’t know of the existence of branches that were
  never reached.

- The decision of whether to emit DA: and BRDA: lines at all is now taken
  strictly according to what’s in analysis.statements.  This is important
  because some lines may appear in analysis.executed and/or
  analysis.executed_branch_arcs but *not* in analysis.statements.
  For example, the beginnings of docstrings are like this, as is the
  phantom line 1 of an empty __init__.py in Python 3.10 and earlier.

  (I am pleased to note that the special casing of empty __init__.py in
  the test suite is no longer required after this patch.)

- We no longer attempt to call branch-coverage-related Analysis methods
  when analysis.has_arcs is false.

And two minor annoyances:

- DA: and BRDA: lines are now emitted strictly in ascending order by (source)
  line number.
- Source file records are now sorted by *relative* pathname, not absolute
  pathname from the coverage database.
nedbat pushed a commit that referenced this issue Sep 11, 2024
* [lcov] Style improvements to lcovreport.py

- Rename LcovReporter.get_lcov to LcovReporter.lcov_file because the
  old name sounds like it returns something.  lcov_file was selected
  by analogy with XmlReporter.xml_file.

- Move the increment of self.total from lcov_file to the loop in
  LcovReporter.report for better separation of concerns.

* [lcov] honor skip_empty; improve consistency with xml2lcov

Implement skip_empty mode in LcovReporter.

Also, improve consistency with what you get by running ‘coverage xml’
and then ‘xml2lcov’ by not emitting any vacuous TN: lines, not emitting
LF:0 LH:0 for empty files, and not emitting BRF:0 BRH:0 for files with
no branches.

* [lcov] don’t write DA-line checksums by default

DA-line checksums bulk up the .lcov file and provide only a weak assurance
that the source file being processed for analysis matches the source file
that was used to generate coverage data.  Current versions of the LCOV suite
discourage their use.

Add a boolean configuration option, lcov.line_checksums, which controls
whether checksums are generated for DA lines.  Consistent with the current
behavior of the LCOV suite, the default is to not generate any checksums.

* [lcov] Re-implement lcov reports using the same algorithm as XML reports.

This fixes five serious bugs:

- The first field of a BRDA: line may not be zero (#1846).
- The first field of a BRDA: line is supposed to be the *source* line of each
  instrumented branch, not the destination line.
- The fourth field of a BRDA: line is supposed to be “-” when the branch
  was *never reached*, not when it was reached but never/always taken (which
  is what a branch’s presence in missing_arcs means).  As far as I can tell,
  coverage.py currently doesn’t know of the existence of branches that were
  never reached.

- The decision of whether to emit DA: and BRDA: lines at all is now taken
  strictly according to what’s in analysis.statements.  This is important
  because some lines may appear in analysis.executed and/or
  analysis.executed_branch_arcs but *not* in analysis.statements.
  For example, the beginnings of docstrings are like this, as is the
  phantom line 1 of an empty __init__.py in Python 3.10 and earlier.

  (I am pleased to note that the special casing of empty __init__.py in
  the test suite is no longer required after this patch.)

- We no longer attempt to call branch-coverage-related Analysis methods
  when analysis.has_arcs is false.

And two minor annoyances:

- DA: and BRDA: lines are now emitted strictly in ascending order by (source)
  line number.
- Source file records are now sorted by *relative* pathname, not absolute
  pathname from the coverage database.

---------

Co-authored-by: Zack Weinberg <zack@owlfolio.org>
@henry2cox
Copy link

I'm coming to the unfortunate conclusion that the coverage.py lcov output is simply wrong with respect to branches. It produces BRDA lines that name the destination line, but should name the source line. The lcov format is poorly documented. This seems to be the best we have: https://manpages.debian.org/stretch/lcov/geninfo.1.en.html .

Likely, the best lcov coverage data format description is at https://github.com/linux-test-project/lcov/blob/master/man/geninfo.1, in the "TRACEFILE FORMAT" section.
This turns into a man page. Whether particular website holds an up-to-date copy - is up to that website's maintainer.
Generally, though: when the format is changed/enhanced, I attempt to maintain backward compatibility.

It says:

Branch coverage information is stored which one line per branch:
BRDA:,,,
Block number and branch number are gcc internal IDs for the branch. Taken is either '-' if the basic block containing the branch was never executed or a number indicating how often that branch was taken.

A more up-to-date version would note that the 'branch number' can be .. anything. In particular, it may be a human-readable expression which would help a user understand the expression truth table.
A possibly good way to envision this is to look at the perl2lcov result - say, from the perl2lcov regression tests in the lcov module.

The 'block number' is there so that the compiler/toolchain can accurately describe cases when the same source gives rise to multiple different expressions (for example, with different numbers of branches or with different associated expressions). This is not uncommon when C++ template functions use temple arguments in some branch expression, or (less commonly) when a tool does aggressive inlining.

As noted in the usage message for lcov's xml2lcovutil.py utility (see here:), the current branch expression structure generated by Coverage.py is not ideal because it doesn't tell us anything about which expressions have been sensitized.
For example: we run 2 tests and see each one has exercised 1 of 2 branch expressions on some particular line.
Did they both hit the same expression - so our branch coverage on that particular line is 50% - or did they hit different expressions - so our coverage is 100%? We can't know.
For the same reason, it is generally not possible to look at the branch coverage result to determine what we need to add to our regression suite in order to improve coverage.

"which" is supposed to be "with" I guess? And line number isn't clarified. The xml2lcov tool reports the source line number, which makes sense. Changing to that interpretation will solve the 0 line number problem.

Yes. Typo. Will fix.

@nedbat
Copy link
Owner

nedbat commented Sep 11, 2024

Likely, the best lcov coverage data format description is at https://github.com/linux-test-project/lcov/blob/master/man/geninfo.1, in the "TRACEFILE FORMAT" section.
This turns into a man page. Whether particular website holds an up-to-date copy - is up to that website's maintainer.
Generally, though: when the format is changed/enhanced, I attempt to maintain backward compatibility.

Thanks for this link. It has been hard to find published information about the format. Do you publish this content in a more readable format someplace? It would help us tool authors provide support for the lcov format.

he current branch expression structure generated by Coverage.py is not ideal because it doesn't tell us anything about which expressions have been sensitized.

Yes, coverage.py is fundamentally line oriented for now.

@nedbat
Copy link
Owner

nedbat commented Sep 11, 2024

BTW, I've made a pull request against the lcov.1 file: https://github.com/linux-test-project/lcov/pull/320/files

@zackw
Copy link
Contributor Author

zackw commented Sep 11, 2024

Thanks, this is really helpful. I had completely misunderstood block numbers; I thought they were for disambiguating multiple branch expressions on the same line, not for disambiguating template expansions and the like. Also, I had no idea that the "branch number" is really a "branch destination label" and can be a human-readable string. In my local working copy I now have the coverage lcov report for

something = True

if something:
    print("Yes, something")
else:
    print("No, nothing")

coming out like this:

DA:1,1
DA:3,1
DA:4,1
DA:6,0
LF:4
LH:3
BRDA:3,0,to line 4,1
BRDA:3,0,to line 6,0
BRF:2
BRH:1

And, in turn, genhtml's hover text for the [ + - ] on line 3 is quite a bit more useful.

the current branch expression structure generated by Coverage.py is not ideal because it doesn't tell us anything about which expressions have been sensitized.

Internally, what coverage.py knows about branch expressions is that line N is the source of more than one control flow arc (== 'edge' in most compiler literature), with destination lines X, Y, Z. (There are almost never more than two destinations, but it can happen.) It does not know under what conditions each arc will be taken. So in

def fn(a, b):             # 1
    if a or b:            # 2
       print("a or b")    # 3
    if a and b:           # 4
       print ("a and b")  # 5

fn(True, False)
fn(False, False)

the internal representation knows that line 2 is the source of control flow arcs leading to lines 3 and 4, both of which were executed at least once. It also knows that line 4 is the source of a control flow arc leading to line 4, which was never executed, and another arc that exits the function, which was executed at least once. But it does not know anything about the subexpressions within lines 2 and 4 and so it cannot report that line 4 was never executed because of b rather than a.

A less contrived example, cut down from real code:

def crunch(*args):
    pass

def fn(data):
   if any(("spectrum" in d.lower() for d in data)):  # line 5
     crunch(data)

fn([])
fn(["image"])
fn(["image", "time-series"])

The internal records tell us that control flow went from line 5 to two different "exit" targets (leaving the generator expression and leaving the entire function), and that it could have also gone to line 6 but never did. In my working tree this will be directly reflected in coverage lcov output as

SF:test.py
DA:1,1
DA:2,0
DA:4,1
DA:5,1
DA:6,0
DA:8,1
DA:9,1
DA:10,1
LF:8
LH:6
BRDA:5,0,to line 6,0
BRDA:5,0,to exit 4,1
BRDA:5,0,to exit 5,1
BRF:2
BRH:1
end_of_record

This is not ideal; the distinction between exit 4 and exit 5 only makes sense in terms of the bytecode translation of the function (in which the generator expression turns into a lambda).

The native report generator, coverage html, can tell that the branch that leaves the generator expression is not interesting, and can be quite specific about what happened: line 5 is identified as "partially executed" with an annotation indicating that it could have, but did not, jump to line 6, "because the condition on line 5 was never true".

image

Cobertura-format XML coverage reports impose further loss of information. The relevant bits of coverage xml output for the same Python file will be

<lines>
 <line number="1" hits="1"/>
 <line number="2" hits="0"/>
 <line number="4" hits="1"/>
 <line number="5" hits="1" branch="true"
       condition-coverage="50% (1/2)" missing-branches="6"/>
 <line number="6" hits="0"/>
 <line number="8" hits="1"/>
 <line number="9" hits="1"/>
 <line number="10" hits="1"/>
</lines>

because coverage xml currently does not know how to generate <conditions> records. That could change, but if I was going to implement it, I would need to see actual documentation for <conditions> and <condition> and I haven't been able to find any.

@henry2cox
Copy link

Thanks for this link. It has been hard to find published information about the format. Do you publish this content in a more readable format someplace? It would help us tool authors provide support for the lcov format.

At present: no. It is just in the source code (...definitive) and the man page (...hopefully accurate).
If someone has a suggestion and a pull request - I'm happy to integrate, though.

@henry2cox
Copy link

... I had no idea that the "branch number" is really a "branch destination label" and can be a human-readable string ...

I would characterize it differently...the 'branch number' (or branch expression) is there to tell us how the conditional expression was evaluated.
What lcov is really trying to show you is that expression truth table - however imperfectly.

Consider

if (a || b) {
   bb_1;
else {
  bb_2;
}

The conditional on the first line has 3 possible outcomes:

  • 'a' is true
  • 'a' is false and 'b' is true
  • both 'a' and 'b' are false

In the first two cases, bb_1 is executed (and it's line coverage hit count will be non-zero - and should really be the sum of the number of times that each of those cases was evaluated).
It will take at least 3 calls to reach 100% branch coverage here - but you can reach 100% line coverage with 2.

If the conditional is simple, then branch coverage doesn't tell you anything that you could not deduce from line coverage...but real control code is often pretty complicated, and scenario testing is really important.

The fly in the ointment here is that most tools don't generate a reduced truth table - for example, gcc/gcov will show 4 outcomes for the above example.
As a result: it is often not possible to reach 100% branch coverage - even with heroic effort - as some of the branch expressions are not reachable, and neither the compiler nor lcov can figure that out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
3 participants