Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: for loops #4

Closed
wants to merge 2 commits into from
Closed

Conversation

RustanLeino
Copy link
Collaborator

No description provided.

``` dafny
var _lo := lo;
var _hi := hi;
assert _lo <= _hi; // this is Dafny-like

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does // this is Dafny-like mean?

Do we really need to assert _lo <= _hi? This could be needlessly restrictive: I think there are some situations where lo might already be bigger than hi, and the intended behavior is simply to do no loop iterations at all if that's the case.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely agree. It is really useful if the loop simply does nothing when lo > hi. Note: I understand it will do nothing when lo == hi, the interval being half-open.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this would be "Dafny-like" because Dafny already imposes the restriction lo <= hi in an expression s[lo..hi].

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is an example when allowing lo > hi is useful? Is it a frequent occurrence?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think lo == hi (which should result in 0 iterations) is very common, and lo > hi is less common, but here's an example: Say I write a method to resize an array from oldsize to newsize (and I don't know which one is bigger), and if newsize is bigger, the new elements should be initialized to some default value, so after copying the elements, maybe I want to be "smart" and write a loop like for i in [oldsize..newsize] { newarray[i] = default } and run that loop no matter whether oldsize or newsize is bigger, relying on the fact that if oldsize is bigger, 0 iterations will be performed. But I guess examples like this one can be easily rewritten so that they also work if lo <= hi is required by Dafny, I can't think of an example right now where such rewriting would be too cumbersome.

Copy link
Collaborator Author

@RustanLeino RustanLeino Apr 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thoughts. For this particular example, one can also use a forall statement:

forall i | 0 <= i < newsize {
  newarray[i] := if i < oldsize then oldarray[i] else default;
}

@samuelgruetter
Copy link

Did you consider making all for-loops iterator-based (like Python)?
This would allow using the same construct to iterate through collections as well as through ranges, and ranges could also specify a step size:

for i in myList.iterator()
  invariant J
{
  ...
}

// starts at 10, decrements by 2, as long as i > 0
for i in range(10, 0, -2)
{
  print(i);
}

Two central design points are:
- the half-open interval of values for which the loop body is performed (e.g., `0 <= i < a.Length`) and
the closed interval that appears in the loop invariant (e.g., `0 <= i <= a.Length`)
- what should be the possible increments of the loop index (e.g., `+1`, `-1`, `+2`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, increments other than +1 and -1 are a non-issue. It happens rarely and when it does, we simply use a while loop.

}
```

is still available.
Copy link

@backtracking backtracking Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, Why3 has a for each loop with syntax

for <pat> in <expr> with <Module> do invariant {...} variant {...} <body> done

It roughly desugars to

let it = <Module>.create <expr> in
try while true do invariant {...} variant {...} let <pat> = <Module>.next it in <body> done
with <Module>.Done -> () end

which means that <Module> is providing

  • a function create (to create the iterator from <expr>)
  • a function next (to get the next value)
  • an exception Done (to signal termination of the iteration)

Note: break and continue are honored.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the way to specify create and next, you might be interested in this NFM 2016 paper:
https://hal.inria.fr/hal-01281759v2/document

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason not to use just foreach loops (using the for keyword seems fine)?

[lo .. hi] in for i in [lo .. hi] is already in Dafny. That'd allow for arbitrary stepping using sequence comprehensions such as in for x in seq(5, i => i*i).

The versions with just -1 and +1 updates seem restrictive to me.

@RustanLeino
Copy link
Collaborator Author

@samuelgruetter @backtracking @mschlaipfer I like the idea of also supporting a foreach loop. (I also really like the <pat> LHS in Why3 -- great idea!)

Dafny has CLU-style iterators. However, they are rarely used, because of some shortcomings. One shortcoming is that they are tricky to specify in general, though simple iterators are fairly easy to specify. Another shortcoming is that iterators don't have a common supertype (other than object). This would be useful to have. The most severe shortcoming is that there is no built-in foreach construct. This means users have to write a white loop of the correct form, and this requires knowing--even for simple iterators--about the various special fields that are there to support the general case. A built-in foreach loop could fix this.

(The Why3 use of exceptions as part of the foreach definition makes me think of Dafny's :-. Maybe there's something cool one can do here, but I don't see exactly what it is.)

The need has also come up for foreach loops over Dafny's built-in collection types (set, map, etc.). It would be nice to combine such a foreach loop with a foreach loop for user-defined iterators.

Finally, I don't know what reason Why3 had to have separate for and foreach loops, but I'm feeling some resistance to the idea of using foreach loops for everything. The automatically generated invariant is more complicated for foreach loops, so from a verification learning standpoint, I think the primitive nature of for is simpler. For example, if the foreach is over a set, like

foreach x in S {
  Body;
}

then you'd want something like this:

var tmpS := S;
var tmpDone := {};
while tmpDone != tmpS
  invariant tmpDone <= tmpS
  decreases |tmpS| - |tmpDone|
{
  var x :| tmpS - tmpDone;
  Body;
  tmpDone := tmpDone + {x};
}

There probably also needs to be some syntax for mentioning what I here called tmpDone.

So, I'm thinking for and foreach would be different statements. There's room for a RFC for a foreach.

@backtracking
Copy link

Finally, I don't know what reason Why3 had to have separate for and foreach loops, but I'm feeling some resistance to the idea of using foreach loops for everything.

Why3 has for and foreach for historical reasons: we had for long before we added foreach. Anyway, I would not make for a particular case of foreach (over an interval) as, as you said, this would complicate the loop invariant a lot. With a for loop, the invariant only depends on the loop index, an integer, which is at hand. It is named in tbe code. With a foreach loop, you have to refer to the sequence of the values already traversed / yet to be traversed (or both), and you have to provide a way to name them. Because of nested loop, you can't use fixed names for these sequences (such as done / todo for instance).

@RustanLeino
Copy link
Collaborator Author

Here's another question. For the most typical for loop (with a +1 increment), using

for i := 0 to 100

seems straightforward. But what about the -1 direction. Previously, I had thought of

for i := 100 downto 0

The order of the two bounds (100 and 0 in my example here) is suggestive of the order in which the iterations will be executed. But here's another choice:

for i := 0 from 100

Here, the lower/upper bounds are written in the same order as in the standard to (+1) case. This has the advantage that the inclusive bound is still on the left and the exclusive bound is on the right.

Is the 0 from 100 to be preferred over 100 downto 0?

@backtracking
Copy link

backtracking commented Apr 13, 2021

I definitely prefer the downto version, but I'm obviously influenced by almost 30 years of OCaml and the fact that we adopted the OCaml syntax in Why3.

By the way, here is a good use example of downto: in place heapsort.

let heapsort a =
  let n = Array.length a in
  for k = (n-2) / 2 downto 0 do downheap a k n done;
  for k = n-1 downto 1 do swap a 0 k; downheap a 0 k done

where downheap a i hi moves a[i] down in the heap using hi as (exclusive) upper limit.

Note that, when n is 0 (empty array),

  • the first loop goes from -1 to 0 and thus does nothing;
  • the second loop goes from -1 to 1 and thus does nothing. This is an example of a 2-gap by the way.

@samuelgruetter
Copy link

0 from 100 looks a bit unintuitive to me. A more intuitive but also more verbose option might be 0 to 100 backwards.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants