Skip to content

Commit

Permalink
Chapter 1 - Loops (#12)
Browse files Browse the repository at this point in the history
* chore: add section on loops

range section missing because clarifications are needed.

* fix: add binarySearch to dict

* chore: init range section

* chore: update range

* chore: triggers, reword, grammar
  • Loading branch information
gottschali authored Dec 13, 2024
1 parent f416eb0 commit 1debf58
Show file tree
Hide file tree
Showing 6 changed files with 288 additions and 0 deletions.
1 change: 1 addition & 0 deletions ci/dictionary.txt
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
personal_ws-1.1 en 0 utf-8
binarySearch
Collatz
Datatypes
Errorf
Expand Down
6 changes: 6 additions & 0 deletions src/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,12 @@
- [Installation](./install.md)
- [Getting started](./getting-started.md)
- [Basic specifications](./basic-specs.md)
- [`assert` and `assume`](./assert-assume.md)
- [requires, ensures, and preserves](./requires-ensures.md)
- [Loops](./loops.md)
- [Invariants](./loops-invariant.md)
- [Binary Search](./loops-binarysearch.md)
- [Range](./loops-range.md)
- [Ghost code](./ghost.md)

# Advanced topics
167 changes: 167 additions & 0 deletions src/loops-binarysearch.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,167 @@
# Binary Search

We will follow along the example of function `binarySearch(arr [N]int, value int) (idx int)` that efficiently finds the rightmost position `idx` in a sorted array of integers `arr` where `value` would be inserted according to the sorted order.
Our approach is to gradually add specifications and fix errors along the way.
If you want to see the final code only you can skip to the end of this chapter.

Let us begin by writing the specification.
First, we must require the `arr` to be sorted.
We have already seen how we can write this as a precondition:
``` go
requires forall i, j int :: 0 <= i && i < j && j < len(arr) ==> arr[i] <= arr[j]
```

To understand the the behavior we want to achieve we can look at a small example.
Note that for values like `1` contained in `arr` we want the index just after the last occurrence of that value to be returned.
``` go
arr := [7]int{0, 1, 1, 2, 3, 5, 8}
binarySearch(arr, -1) == 0
binarySearch(arr, 0) == 1
binarySearch(arr, 1) == 3
binarySearch(arr, 8) == 7
binarySearch(arr, 9) == 7
```
<!-- we can't assert them with our version... -->

All values to the left of `idx` should compare less or equal to `value`
and all values from `idx` to the end of the array should be strictly greater than `value`.
``` go
ensures forall i int :: 0 <= i && i < idx ==> arr[i] <= value
ensures forall j int :: idx <= j && j < len(arr) ==> value < arr[j]
```
Something is still missing.
An issue is that the `Index j into arr[j] might be negative` since we only have `idx <= j` and no lower bound for `idx`.
Similarly, the `Index i into arr[i] might exceed sequence length`.
After constraining `idx` to be between `0` and `N` we are ready to implement `binarySearch`:
``` go
// @ requires forall i, j int :: 0 <= i && i < j && j < N ==> arr[i] <= arr[j]
// @ ensures 0 <= idx && idx <= N
// @ ensures forall i int :: 0 <= i && i < idx ==> arr[i] <= value
// @ ensures forall j int :: idx <= j && j < len(arr) ==> value < arr[j]
func binarySearch(arr [N]int, value int) (idx int) {
low := 0
high := N
mid := 0
for low < high {
mid = (low + high) / 2
if arr[mid] <= value {
low = mid + 1
} else {
high = mid
}
}
return low
}
```
We will have to add several invariants until we can verify the function.

First Gobra complains that the `Index mid into arr[mid] might be negative.`

So our first invariant is, that `mid` remains a valid index for `arr`:
``` go
// @ invariant 0 <= mid && mid < N
```
Let's check manually if this invariant works.
1. Before the first iteration `mid` is initialized to `N / 2` hence `0 <= N / 2 && N / 2 < N` trivially holds.
2. For an arbitrary iteration, we assume that before this iteration `0 <= mid && mid < N` was true. Now we need to show that after updating `mid = (low + high) / 2` the invariant is still true (the rest of the body does not influence mid). But we fail to do so as the invariant does not capture the values `low` and `high`.

So let us add what we know about `low` and `high` to help the verifier.
``` go
// @ invariant 0 <= low && low < high && high < N
// @ invariant 0 <= mid && mid < N
```
With this change we fixed the *Index-Error* but are confronted with a new error.
``` text
Loop invariant might not be preserved.
Assertion low < high might not hold.
```
While `low < high` is true before the first iteration (assuming `N>0`)
and holds by the loop condition at the beginning of every except the last iteration.
But an invariant must hold after every iteration, including the last.
This is achieved by changing `low < high` to `low <= high`.

Note that after exiting the loop we know `!(low < high)` because the loop condition must have failed and `low <= high` from the invariant.
Together this implies `low == high`.

Our next challenge is:
``` text
Postcondition might not hold.
Assertion forall i int :: 0 <= i && i < idx ==> arr[i] <= value might not hold.
```

So we need to find assertions that describe which parts of the array we have already searched.
The goal is that after the last iteration the invariants together with `low == high` should be able prove the postconditions.

For this step it is useful to think about how binary search works.
The slice `arr[low:high]` denotes the part of the array we still have to search for and which is halved every iteration.
In the prefix `arr[:low]` are no elements larger than `value`
and in the suffix `arr[high:]` no elements smaller or equal than `value`.
Exemplified for `binarySearch([7]int{0, 1, 1, 2, 3, 5, 8}, 4)`:

| `low` | `mid` | `high` | `arr[low:]` | `arr[high:]` |
|-------|-------|--------|---------------|--------------|
| 0 | 0 | 7 | [] | [] |
| 4 | 3 | 7 | [0 1 1 2] | [] |
| 6 | 5 | 7 | [0 1 1 2 3 5] | [] |
| 6 | 6 | 6 | [0 1 1 2 3 5] | [8] |

Translating the above into Gobra invariants gives:
``` go
// @ invariant forall i int :: 0 <= i && i < low ==> arr[i] <= value
// @ invariant forall j int :: high <= j && j < N ==> value < arr[j]
```

When the function returns, `idx == low` holds and as discussed above also `low == high`.
We can clearly see that `low` and `high` with `idx` yields the postconditions:

``` go
// @ ensures forall i int :: 0 <= i && i < idx ==> arr[i] <= value
// @ ensures forall j int :: idx <= j && j < len(arr) ==> value < arr[j]
```

Now we can be happy because `Gobra found no errors`.

We will see `binarySearch` search again when we look at termination and do overflow checking.

## Full Example

``` go
// @ requires forall i, j int :: 0 <= i && i < j && j < N ==> arr[i] <= arr[j]
// @ ensures 0 <= idx && idx <= N
// @ ensures forall i int :: 0 <= i && i < idx ==> arr[i] <= value
// @ ensures forall j int :: idx <= j && j < len(arr) ==> value < arr[j]
func binarySearch(arr [N]int, value int) (idx int) {
low := 0
high := N
mid := 0
// @ invariant 0 <= low && low <= high && high <= N
// @ invariant forall i int :: 0 <= i && i < low ==> arr[i] <= value
// @ invariant forall j int :: high <= j && j < N ==> value < arr[j]
// @ invariant 0 <= mid && mid < N
for low < high {
mid = (low + high) / 2
if arr[mid] <= value {
low = mid + 1
} else {
high = mid
}
}
return low
}
```

<!-- Client Code -->
<!-- ``` go -->
<!-- // @ requires forall i, j int :: 0 <= i && i < j && j < len(arr) ==> arr[i] <= arr[j] -->
<!-- // @ ensures found == -1 ==> forall i int :: 0 <= i && i < len(arr) ==> arr[i] != value -->
<!-- // @ ensures found != -1 ==> 0 <= found && found < len(arr) && arr[found] == value -->
<!-- func find(arr [N]int, value int) (found int) { -->
<!-- idx := binarySearch(arr, value) -->
<!-- if idx == 0 || arr[idx-1] != value { -->
<!-- return -1 -->
<!-- } else { -->
<!-- return idx - 1 -->
<!-- } -->
<!-- } -->
<!-- ``` -->

38 changes: 38 additions & 0 deletions src/loops-invariant.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
# Invariants

An invariant is an assertion that is preserved by the loop across iterations.

The loop invariant is valid if it holds...
1. before the first iteration after performing the initialization statement
2. after every iteration
3. when exiting the loop

In Gobra we can specify it with the `invariant` keyword before a loop.
``` go
//@ invariant ASSERTION
for condition {
// ..
}
```

Similarly to `requires` and `ensures` you can split an `invariant` on multiple lines.

<!--
``` gobra
func client() {
{ // to limit the scope
i := 0 // hoisted initialization
assert INV
invariant INV
for ; i < N; i++ {
BODY // assuming no jumps outside
assert INV
}
assert INV
}
// assert INV // may fail here, could depend on i that is out of scope
}
```
-->
65 changes: 65 additions & 0 deletions src/loops-range.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
# Range Loops
Go supports loops iterating over a range clause.
In this section we use arrays, later we will also iterate over slices and maps.
Gobra does not support ranges over integers, strings, slices, and functions.


We are given code that verifies containing the following loop iterating over an integer array:
``` go
// @ requires len(arr) > 0
// @ ensures forall i int :: {arr[i]} 0 <= i && i < len(arr) ==> res >= arr[i]
func almostMax(arr [N]int) (res int) {
//@ invariant 0 <= i && i <= len(arr)
//@ invariant forall k int :: {arr[k]} 0 <= k && k < i ==> res >= arr[k]
for i := 0; i < len(arr); i += 1 {
if arr[i] > res {
res = arr[i]
}
}
return
}
```
But if we refactor it using a `range` clause we face an error:
``` go
~// @ requires len(arr) > 0
~// @ ensures forall i int :: {arr[i]} 0 <= i && i < len(arr) ==> res >= arr[i]
~func almostMax(arr [N]int) (res int) {
//@ invariant 0 <= i && i <= len(arr)
//@ invariant forall k int :: {arr[k]} 0 <= k && k < i ==> res >= arr[k]
for i, a := range arr {
if a > res {
res = a
}
}
~return
~}
```
``` text
Postcondition might not hold.
Assertion forall i int :: 0 <= i && i < len(arr) ==> res >= arr[i] might not hold.
```
For loops, the general pattern is that the negation of the loop condition together with the invariant imply the postcondition.
In the standard for loop, we can deduce that `i == len(arr)` after the last iteration while
it is `len(arr)-1` in the range version.

We can specify an additional loop variable defined using `with i0` after `range`.
The invariant `0 <= i0 && i0 <= len(arr)` holds as well as `i0 < len(arr) ==> i == i0`.
Additionally, `i0` will be equal to `len(arr)` at the end of the loop.
Hence if we replace `i` with `i0` in the second invariant, Gobra can infer the postcondition.
We no longer need the invariant constraining `i` and our final verifying version is:
``` go
~package main
~const N = 42
// @ requires len(arr) > 0
// @ ensures forall i int :: {arr[i]} 0 <= i && i < len(arr) ==> res >= arr[i]
func almostMax(arr [N]int) (res int) {
res = arr[0]
//@ invariant forall k int :: {arr[k]} 0 <= k && k < i0 ==> res >= arr[k]
for _, a := range arr /*@ with i0 @*/ {
if a > res {
res = a
}
}
return
}
```
11 changes: 11 additions & 0 deletions src/loops.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
# Loops

Reasoning with loops introduces a challenge for verification.
Loops could execute an unbounded number of iterations.
Similar to pre and postconditions for functions, we have to write a specification for a loop in the form of an *invariant*.

Using the example of a binary search, we showcase how we can gradually build up the invariant until we are able to verify the function.

We will also introduce the verification of loops over a `range`.

Termination of loops will be discussed in the [next section](./termination.md).

0 comments on commit 1debf58

Please sign in to comment.