-
Notifications
You must be signed in to change notification settings - Fork 169
Some algorithm benchmarks written in Java. #823
Conversation
…updated the BellmanFord-MemUnsat* algorithms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you Zafer for adding these interesting benchmarks.
I tried to review them, but found a couple of issues that need to be addressed before merging.
In addition, I think these benchmarks raise a couple of interesting question regarding which new properties should be introduced for the Java SV-Comp track. Not all files are checking assert properties. Could you clarify for these examples what the target property is ? Eventually, we could introduce a proposal how to specify these target properties for future versions of the SV-Comp Java track as a second step.
In general, I would suggest to split the PR into smaller pieces of files that should be ready to merge soon and a second PR with files that involve more discussion. This might shorten the time required to get some of the test cases into the code base.
final int V = Verifier.nondetInt(); | ||
if (V <= 0) return; // change with Verifier.assume? | ||
|
||
final int D[] = new int[V*V]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Site notice:
Until now, I think just assert statements are allowed to specify reachability properties in the Java track but it would be interesting to check here for the possible overflow in the execution.
Specifying the reachability of certain Standard Java-Exceptions might be one option report this overflow or we might add an assert statement that ensures no overflow before the int.
nit-pick:
INFINITY < V
might occur using Verifier.nondetInt()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the change INFINITY = Integer.MAX_VALUE
should resolve this:
nit-pick:
INFINITY < V
might occur usingVerifier.nondetInt()
Adding the assumptions I've mentioned in another comment should resolve the overflow issue I think:
- assume that V is at most 1e6, which should still be almost impossible to verify using bounded methods and cause no overflows in the calculations
- assume the same with the initialization of D at line 98
Specifying the reachability of certain Standard Java-Exceptions might be one option report this overflow or we might add an assert statement that ensures no overflow before the int.
We have assumed that the verifier should check the reachability of these standard exceptions. If this is not suitable, maybe we can wrap the potentially unsafe code with try-catch blocks, with assert false; in the catch statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is my understanding that currently only explicit assert violations are checked in this category according to the property file. Using a try-catch blog sounds legit to me. But I might be wrong.
Otherwise it might be possible to define a new property for checking the standard errors on a program and use such a property along with this file. I am not sure, what would be the better approach.
Eventually @peterschrammel or @dbeyer might give some guidance for the better solution here.
for (int i = 0; i < V; i++) { | ||
for (int j = 0; j < V; j++) { | ||
if (i == j) continue; | ||
D[i*V+j] = Verifier.nondetInt(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the complete Integer space in the weight function in combination with the standard Java integer arithmetic will lead to potentially tons of overflows in line 67 and 68 that impact the algorithm's functional correctness.
Nevertheless, if a run terminates the assert in line 106 might hold while the complete result is wrong regarding the requirements of the BellmanFord algorithm.
In my personal Opinion, this example needs a stronger guard for checking the functional correctness, or more asserts/assumes guarding the arithmetic to be valuable to the SV-Comp Java benchmark set. But this comment is only intended to start a discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure how we can strengthen the assertions without making them almost impossible to verify. Maybe we can add two assumptions?
- assume that V is at most 1e6, which should still be almost impossible to verify using bounded methods and cause no overflows in the calculations
- assume the same with the initialization of
D
at line 98
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the assumptions might be fine for now. Nevertheless, being able to verify such functional correctness properties and find a way to express them might be an interesting challenge in the long term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added the assumptions to BellmanFord benchmarks for now.
@@ -0,0 +1,104 @@ | |||
import org.sosy_lab.sv_benchmarks.Verifier; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my point of view, this example is missing an assert statement. What is the property that should be checked here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The benchmarks with the extension MemSat and MemUnsat are all without explicit assertions, and the property being checked is that there are no thrown standard Java exceptions. As you've mentioned in another comment we have assumed that the verifier should check the reachability of these standard exceptions.
If this is not suitable, maybe we can wrap the potentially unsafe code with try-catch blocks, with assert false;
in the catch statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thank you for changing the benchmarks.
@@ -0,0 +1,123 @@ | |||
import org.sosy_lab.sv_benchmarks.Verifier; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my point of view, this file is missing an assert statement. What is the property, that should be proved here?
@@ -0,0 +1,123 @@ | |||
import org.sosy_lab.sv_benchmarks.Verifier; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my point of view, this file is missing an assert statement. What is the property, that should be proved here?
*/ | ||
|
||
// IterativeMergeSort.java | ||
// By David Kosbie |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is not a proper license. Might you clarify it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tracked the original source of the benchmark to here, but I cannot really see any mention of a license either. I can separate the MergeSort algorithms from the pull request until I can resolve this issue.
|
||
// IterativeMergeSort.java | ||
// By David Kosbie | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is not a proper license. Might you clarify it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
@@ -0,0 +1,109 @@ | |||
import org.sosy_lab.sv_benchmarks.Verifier; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General comment: I think all files need a clear comment under which license the modifications are published.
Further, there should be a clear indication whether the files is modified or not and how is the author of the modifications.
@zafer-esen Could you please complete this pull request such that it can be merged? |
Hi @dbeyer, I have just pushed a new commit covering most of the requested changes. I have not done anything regarding the issue in this comment. Depending on your feedback, I can add the try-catch blocks in the memory safety benchmarks in this pull request or separately. Addiiton of more benchmarks with different assertion strength levels proposed in this comment is also not in the latest commit. I can maybe create another pull request for those? |
@zafer-esen I would say yes to both comments (yes, in the competition we check (currently) only for asserts that are reached and violated, yes, please add the variants, but in a new pull request). |
…est memory safety
@dbeyer and @mmuesly, thank you both for your comments. I have added the assertions via try-catch blocks to all the benchmarks which lacked these, in the latest commit. Regarding this comment, I have not deleted the MergeSortIterative* benchmarks as I have contacted the original author David Kosbie (koz@cmu.edu), and he kindly granted permission to freely use this code. I have added this info to the headers of the relevant benchmarks, if this will suffice. I will create another pull request for the sorting algorithm benchmaks, with assertions which test stronger properties. |
@mmuesly Could you please have another look? Is this pull request ready to be merged from your point of view? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finished a second round of review.
The modifications made by @zafer-esen look good to me. I think the benchmarks are following the SV-Comp format now and might be merged. Thank you for the additional work invested.
@dbeyer I would love to add the following point to the wish list for future SV-Comp Java infrastructure arising from the discussion about this PR:
Integrate property definitions for standard exceptions. There are various examples in the code base of SV-Comp triggering standard exceptions. As these are standard exceptions, Java verification tools should be able to detect them from my point of view. It would be interesting to make this also part of the competition.
Does CI make sure these test cases are compilable? I haven't checked this myself, but we might fix this along the way, if problems arise. This PR contributes some examples using more than one Java file. I think this is the first time I have seen this in the Java track. Therefore, we should check that all tools are able to deal with these cases from my point of view, even if the rule allow this explicitly.
@@ -0,0 +1,104 @@ | |||
import org.sosy_lab.sv_benchmarks.Verifier; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thank you for changing the benchmarks.
CI checks test cases if they are added to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
programs added to new and appropriately named directory
license present and acceptable (either in separate file or as comment at beginning of program)
contributed-by present (either in README file or as comment at beginning of program)
programs added to a
.set
file of an existing category, or new sub-category established (if justified)intended property matches the corresponding
.prp
fileprograms and expected answer added to a
.yml
file according to task definitions.cfg
file