Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dafny Test Coverage Report #4325

Merged
merged 17 commits into from
Aug 17, 2023
Merged

Conversation

Dargones
Copy link
Collaborator

@Dargones Dargones commented Jul 24, 2023

This PR adds functionality for emitting JaCoCo-style coverage reports for Dafny code. A coverage report is a set of RangeTokens/CoverageLabel pairs, which are rendered like in the image below:

Screenshot 2023-07-24 at 1 56 12 AM

Each report is a time-stamped directory with an index.html file that summarizes the coverage and contains links to all other files:

Screenshot 2023-07-24 at 1 56 00 AM

Because there might be several notions of coverage in Dafny (coverage that Dafny expects automatically generated tests to have, actual runtime coverage on C#, Java, Python, etc.), I add a new command, called merge-coverage-reports that takes in several existing coverage reports and links them together so that it is easy to cross-reference them (see images above).

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

@Dargones Dargones marked this pull request as ready for review August 1, 2023 19:54
atomb
atomb previously approved these changes Aug 8, 2023
Copy link
Member

@atomb atomb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little wary that merging HTML files rather than a simpler data format could be fragile, but there is something appealing about it from the UI standpoint: there's no difference between what the computer interprets and what the human reads.

@@ -86,6 +87,10 @@ private enum Mode {
"Print the Boogie code used during test generation.") {
ArgumentHelpName = "filename"
};
public static readonly Option<string> PrintCoverage = new("--print-coverage",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might call this --coverage-report, because to me --print-coverage suggests that it'll print it to standard output.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have changed the option to --coverage-report

public enum CoverageLabel {
FullyCovered,
NotCovered,
PartiallyCovered
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't require any changes right now, but I'm trying to think out how this would be used when we have multiple types of coverage. We could have expected test coverage, actual test coverage, and verification coverage. Those could be represented by a (CoverageType, CoverageLabel) pair, perhaps. And then a set of those pairs could be translated to a specific sort of highlighting.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would make most sense. I will experiment with merging different coverage metrics - probably makes sense to separate that into its own PR

private static int nextUniqueId = 0;

// INVARIANT: CoverageSpans are sorted within each list by the position of the StartToken
private readonly Dictionary<string, List<CoverageSpan>> labeledFiles;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this name confusing. Maybe labelsByFile or something like that would be better?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@Dargones
Copy link
Collaborator Author

Dargones commented Aug 9, 2023

Thank you, @atomb! I am somewhat wary of the HTML parsing code as well but I think it would have to be done either way, unless there is some standard coverage format that fits all the criteria we need and an external tool that can automatically produce an html report from that same format?

@@ -354,6 +354,8 @@ public enum IncludesModes {
public bool UseStdin = false;
public bool WarningsAsErrors = false;
[CanBeNull] private TestGenerationOptions testGenOptions = null;
public List<string> CoverageReportsToMerge = new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you add these fields here instead of a location that's specific to coverage?

Copy link
Collaborator Author

@Dargones Dargones Aug 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have now moved them to CoverageReportCommand.cs

@@ -0,0 +1,90 @@
body, td {
Copy link
Member

@keyboardDrummer keyboardDrummer Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this CLI UI and all related files be in DafnyDriver instead of DafnyCore?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if I were to move these files to DafnyDriver, I would have to move the coverage reporting code there as well, so that they can be accessed from within the same assembly? My reason to put the assets in DafnyCore was that this mimics the set up for the Auditor.

Copy link
Member

@keyboardDrummer keyboardDrummer Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if I were to move these files to DafnyDriver, I would have to move the coverage reporting code there as well, so that they can be accessed from within the same assembly?

Sounds good. I'm wondering if we should avoid having code that is clearly UI, and for which thus multiple implementations are possible, remain outside of Core. Core will always have some 'UI' in the form of error messages and descriptions that target users, but those seem at a lower level of ambiguity than choices like html+css vs a 3D model.

My reason to put the assets in DafnyCore was that this mimics the set up for the Auditor.

Isn't the auditor code a little all over the place?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, sorry, I think I misunderstood your original comment. I have now moved the CoverageReportCommand.cs and all the HTML parsing to DafnyDriver, with the tools for creating an in-memory coverage report remaining in Dafny Core.

private readonly Dictionary<string, List<CoverageSpan>> labelsByFile;
public readonly string Name; // the name to assign to this coverage report
public readonly string Units; // the units of coverage (plural). This will be written in the coverage report table.
private readonly int uniqueId = nextUniqueId++;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this isn't thread-safe, and I imagine we could at some point run multiple tests concurrency in a single process that would generate coverage reports.

Consider using Interlocked.Increment

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

public readonly string Name; // the name to assign to this coverage report
public readonly string Units; // the units of coverage (plural). This will be written in the coverage report table.
private readonly int uniqueId = nextUniqueId++;
public string UniqueId => "." + (uniqueId == 0 ? "" : (uniqueId + ".")); // to add as postfix to files
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you really need this static? If you want to prevent overwriting files, shouldn't you read the disk to prevent that? Or, if you generate multiple files per Dafny invocation, shouldn't you add a meaningful suffix to differentiate between those files?

Copy link
Collaborator Author

@Dargones Dargones Aug 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the option for the user to specify a suffix in my latest commit, so the set up should be better now. When writing to disk, I always create a new directory, so overwriting shouldn't be a problem. Even so, the numerical suffix is still necessary because when you merge arbitrary coverage reports from disk, there must be some way to differentiate between them, and they might have been created with the same user-provided suffix.

Copy link
Member

@keyboardDrummer keyboardDrummer Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even so, the numerical suffix is still necessary because when you merge arbitrary coverage reports from disk, there must be some way to differentiate between them, and they might have been created with the same user-provided suffix.

Are you saying you can't merge files with the same filename that are in different directories? I think I misunderstand.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When serializing a report, I create an .html file for each .dfy file in the sources. When serializing several reports, I therefore have several .html files for a single original .dfy file and so I have to differentiate between them in some way. By default, I will use the user-provided suffix indicating what kind of coverage is being reported by each .html file. But if these suffixes are equal as well, I propose to use a numerical index.

@@ -13,6 +13,9 @@ namespace DafnyCore.CoverageReporter;

public class CoverageReportCommand : ICommandSpec {

public static List<string> CoverageReportsToMerge = new();
Copy link
Member

@keyboardDrummer keyboardDrummer Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, but now you need statics, which aren't great either.

Could you allow generically storing arguments in DafnyOptions? I put a draft of this idea in this PR: https://github.com/dafny-lang/dafny/pull/4410/files

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, this has been super helpful!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woop :) thanks. Closing the PR then.

@Dargones
Copy link
Collaborator Author

Thank you, @keyboardDrummer! I have now isolated all HTML parsing inside DafnyDriver, with all abstract functionality remaining in DafnyCore. I also made use of the code you linked above to store options generically - this has been super useful.

@keyboardDrummer
Copy link
Member

Thanks for changes. I would have moved the coverage code that's currently in Core to DafnyTestGeneration, since AFAIK that's the only place where it's used. I'm guessing you put it in core because you expect it will be used there in the future. However, it's cheap to move code later and I always prefer designing for today and adapting things to future needs when they come into play.

Also, I want to apologize for not having a crisp explanation of what should be in which assembly. My feeling is that we currently don't have the assembly responsibilities defined well. However, I think avoiding putting things in Core will make it easier to reorganize things later, so I generally advocate for that.

@Dargones
Copy link
Collaborator Author

Thank you, Remy! I have moved the files from DafnyCore to DafnyTestGeneration. Like you have said, it should be very easy to relocate them later if necessary. @keyboardDrummer, @atomb - could you reapprove/merge?

@keyboardDrummer keyboardDrummer enabled auto-merge (squash) August 17, 2023 10:23
@keyboardDrummer keyboardDrummer merged commit 9e18bfc into dafny-lang:master Aug 17, 2023
17 of 18 checks passed
keyboardDrummer added a commit to keyboardDrummer/dafny that referenced this pull request Sep 15, 2023
This PR adds functionality for emitting JaCoCo-style coverage reports
for Dafny code. A coverage report is a set of RangeTokens/CoverageLabel
pairs, which are rendered like in the image below:

<img width="606" alt="Screenshot 2023-07-24 at 1 56 12 AM"
src="https://github.com/dafny-lang/dafny/assets/21084554/6e91480c-9c16-4fe3-bac6-e5693416377b">

Each report is a time-stamped directory with an index.html file that
summarizes the coverage and contains links to all other files:

<img width="636" alt="Screenshot 2023-07-24 at 1 56 00 AM"
src="https://github.com/dafny-lang/dafny/assets/21084554/2f6a47e4-e922-40c4-b1f5-2dc02c74dad2">

Because there might be several notions of coverage in Dafny (coverage
that Dafny expects automatically generated tests to have, actual runtime
coverage on C#, Java, Python, etc.), I add a new command, called
`merge-coverage-reports` that takes in several existing coverage reports
and links them together so that it is easy to cross-reference them (see
images above).

<small>By submitting this pull request, I confirm that my contribution
is made under the terms of the [MIT
license](https://github.com/dafny-lang/dafny/blob/master/LICENSE.txt).</small>

---------

Co-authored-by: Aleksandr Fedchin <fedchina@amazon.com>
Co-authored-by: Remy Willems <rwillems@amazon.com>
keyboardDrummer added a commit that referenced this pull request Sep 19, 2023
This PR adds functionality for emitting JaCoCo-style coverage reports
for Dafny code. A coverage report is a set of RangeTokens/CoverageLabel
pairs, which are rendered like in the image below:

<img width="606" alt="Screenshot 2023-07-24 at 1 56 12 AM"
src="https://github.com/dafny-lang/dafny/assets/21084554/6e91480c-9c16-4fe3-bac6-e5693416377b">

Each report is a time-stamped directory with an index.html file that
summarizes the coverage and contains links to all other files:

<img width="636" alt="Screenshot 2023-07-24 at 1 56 00 AM"
src="https://github.com/dafny-lang/dafny/assets/21084554/2f6a47e4-e922-40c4-b1f5-2dc02c74dad2">

Because there might be several notions of coverage in Dafny (coverage
that Dafny expects automatically generated tests to have, actual runtime
coverage on C#, Java, Python, etc.), I add a new command, called
`merge-coverage-reports` that takes in several existing coverage reports
and links them together so that it is easy to cross-reference them (see
images above).

<small>By submitting this pull request, I confirm that my contribution
is made under the terms of the [MIT
license](https://github.com/dafny-lang/dafny/blob/master/LICENSE.txt).</small>

---------

Co-authored-by: Aleksandr Fedchin <fedchina@amazon.com>
Co-authored-by: Remy Willems <rwillems@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants