Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CBMC incorrectly identify nonexistent loop #7506

Open
celinval opened this issue Jan 26, 2023 · 13 comments
Open

CBMC incorrectly identify nonexistent loop #7506

celinval opened this issue Jan 26, 2023 · 13 comments
Assignees
Labels
aws Bugs or features of importance to AWS CBMC users More info needed

Comments

@celinval
Copy link
Collaborator

CBMC version: 5.75.0
Operating system: Ubuntu 22.04
Exact command line resulting in the issue: cbmc no_loops.c --unwind 1 --unwinding-assertions
What behaviour did you expect: The assertion main.assertion.1 in the code should fail.
What happened instead: The assertion main.assertion.1 succeeds, while there's a unexpected unwinding assertion failure.

Source code:

// no_loops.c
int loop_free() {
bb0:;
    int var_3 = 1;
    goto bb1;

exit:;
    return 10;

bb1:;
    goto exit;
}

int main() {
    assert(loop_free() == 0);
    return 0;
}

Output:

** Results:
no_loops.c function loop_free
[loop_free.unwind.0] line 11 unwinding assertion loop 0: FAILURE

no_loops.c function main
[main.assertion.1] line 16 assertion return_value_loop_free == 0: SUCCESS

** 1 of 2 failed (2 iterations)
VERIFICATION FAILED
@celinval celinval added aws Bugs or features of importance to AWS CBMC users Kani Bugs or features of importance to Kani Rust Verifier labels Jan 26, 2023
@celinval
Copy link
Collaborator Author

It looks like CBMC identifies the edge bb1 -> exit as a back edge because their location in the code, when that really shouldn't matter.

@celinval celinval changed the title CBMC incorrectly identify unexistent loop CBMC incorrectly identify nonexistent loop Jan 26, 2023
@tautschnig
Copy link
Collaborator

It looks like CBMC identifies the edge bb1 -> exit as a back edge because their location in the code, when that really shouldn't matter.

It is true that this doesn't match Muchnick's definition of a back edge, because we are going "back" (in location order) to a node we haven't even seen before. Fixing symbolic execution alone would be fairly easy (we would just need to keep track of which nodes we have seen already, although there is memory cost to this), but then it would be much harder to label loops.

But ... what is the real problem here? The example works just fine when not specifying an unwinding bound, or using any bound greater or equal to 2.

@martin-cs
Copy link
Collaborator

@celinval I have my doubts about how robust our loop detection and handling is against unusually structured and adversarial code. I think symex handles it reasonably but other parts of the system are more fragile. If it's something you want to pursue I have a collection of horrible examples that I can share! I have hopes that we might have someone working on this soon.

@celinval
Copy link
Collaborator Author

But ... what is the real problem here? The example works just fine when not specifying an unwinding bound, or using any bound greater or equal to 2.

  1. This logic is fragile. Compiler transformations can affect the order of basic blocks declaration just because they tend to append the new basic block to the list of existing ones. But their CFG and loops are still well formed. I bumped into this issue after updating the rust compiler that is used by Kani (Upgrade toolchain to nightly-2023-01-23 model-checking/kani#2149). A new loop iteration showed up out of the blue and it's actually breaking our customer harness because the bound that they had previously set up no longer holds.
  2. This is not intuitive either. I always struggled to figure out a good bound for my harnesses, and we literally tell users to add 2 to the number they think should be used just in case. I wonder if some cases are related to how CBMC is identifying loops in the first place.

@martin-cs
Copy link
Collaborator

@celinval makes a good point, it would be good if we could do better loop detection. Maybe even have a loop canonicalisation pass? I guess the question is who implements it.

@celinval
Copy link
Collaborator Author

BTW, I like how LLVM terminology defines loops and cycles as two different things. But both of them are based on the CFG, not the order of their occurrence.

@celinval
Copy link
Collaborator Author

@celinval makes a good point, it would be good if we could do better loop detection. Maybe even have a loop canonicalisation pass? I guess the question is who implements it.

Yes. I think having a pass to handle loops is a great idea. It would also likely mitigate @tautschnig concern of memory increase, since it should be a simple DFS, and we can just cache the final result.

In this case, can we label all cycle / loop entries?

@remi-delmas-3000
Copy link
Collaborator

remi-delmas-3000 commented Jan 27, 2023

But ... what is the real problem here?

goto-instrument --show-natural-loops shows no loops on this example, and goto-instrument --show-loops shows a loop (jump from bb1 to exit), and it seems like most users assimilate loops with natural loops, making it hard to know what to do in that situation.

@martin-cs there is a similar discussion going on about loop identification and canonicalisation from the perspective of loop contract instrumentation, with the additional needs that goto-instructions forming natural loops are densely packed in the instruction sequence without intermediate non-loop instructions, so that the loop instructions sequence forming the loop can easily be replaced by its base ; havoc ; step model for loop contract verification.

Could we take inspiration from LLVM and at try to rewrite the goto-instruction sequence so that

  • natural loops are in LoopSimplifyForm,
  • re-linearise the CFG of instructions into a sequence such that the only backedges (from the linear goto instruction perspective) are the backedges of normalised natural loops ?

@celinval
Copy link
Collaborator Author

It sounds good to me. As long as the loop identification is done based on the CFG, and we don't end up rewriting cases like this where there is no loop.

@celinval
Copy link
Collaborator Author

I do have a question though, is there an easy fix to solve the issue at hand while we wait for the more robust solution?

@martin-cs
Copy link
Collaborator

martin-cs commented Jan 27, 2023 via email

@martin-cs
Copy link
Collaborator

Oh, forgot to add, other tools that need this / are probably broken by some of the horrible loop test cases : loop acceleration, cprover conversion to Horn Clauses, 2ls's abstraction (I know @peterschrammel had a lot of "fun" fixing some of these in that, especially breaks IIRC), ummm... probably some others too.

@thomasspriggs
Copy link
Collaborator

My understanding is that a loop canonicalization pass may be able to resolve this issue. I am commenting here in order to link the canonicalization ticket - #7211

@feliperodri feliperodri removed aws-high Kani Bugs or features of importance to Kani Rust Verifier labels Feb 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aws Bugs or features of importance to AWS CBMC users More info needed
Projects
None yet
Development

No branches or pull requests

7 participants