You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I wanted to mathematically prove the correctness of an algorithm in range-set-blaze, a Rust crate. Rust-specific solutions seemed too hard to use or not good for general algorithmic verification. So, I ported the algorithm to Dafny and (with help from Divyanshu Ranjan) completed the proof.
We've written up our experience in a free article in Towards Data Science/Medium. It goes over what we learned with examples and tips in the form of "rules". Some might read our write up and be discouraged that the process is not easier or more automatic. I, however, am encouraged that the process is possible at all.
The biggest surprise? That a working validation can stop working because the random search changes. That led to Rule 9: "Rework Your Validation for Reliability.".
Dafny folks may also be interested in our conclusion. Here is an excerpt:
Aside: Since geometry class — in high school — I’ve found math proofs fascinating and frustrating. “Fascinating” because a math theorem once proven is known true forever. (Euclid’s geometry is still considered true. Aristotle’s physics is not.) “Frustrating” because my math classes always seemed vague about which axioms I could assume and how big of steps my proof could take. Dafny and similar systems remove this vagueness with automatic proof checking. Even better, from my point of view, they help us create proofs about an area for which I care deeply: algorithms.
When is it worth doing a formal proof of an algorithm? Given the work involved, I will only do this again when the algorithm is some combination of tricky, important, or easy-to-prove.
How might the process improve in the future? I’d love to see:
Interchange between systems — A geometry theorem once proven need never be proven again. I’d love if the systems checking algorithmic proofs could use each other’s proofs.
An all-Rust system as easy to use as Dafny — For work in this direction, see [1,2].
Aside: Do you know of an easy-to-use Rust validation system? Please consider applying it to the validation of internal_add. This would let us compare the Rust system’s easy-of-use and power to Dafny’s.
The proof analog of Rust’s Cargo.lock files — In Rust, we use the Cargo.lock to lock in a known-good combination of project dependencies. I wish when Dafny found a way to prove, for example, a method, that it would lock in the proof steps it found. This could make validation more reliable.
Better AI for validation — My intuition is that ChatGPT, slightly improved, could be good at creating 90% of needed validation code. I find current ChatGPT 4 poor with Dafny, I assume for lack of Dafny training examples.
Better validation for AI —When an AI generates code, we worry about the code’s correctness. Formal validation could help by proving correctness. (For a small example of this, see my article Check AI-Generated Code Perfectly and Automatically.)
Thanks to the Dafny community for both Dafny itself and for answering my many questions while working on this project!
reacted with thumbs up emoji reacted with thumbs down emoji reacted with laugh emoji reacted with hooray emoji reacted with confused emoji reacted with heart emoji reacted with rocket emoji reacted with eyes emoji
-
I wanted to mathematically prove the correctness of an algorithm in range-set-blaze, a Rust crate. Rust-specific solutions seemed too hard to use or not good for general algorithmic verification. So, I ported the algorithm to Dafny and (with help from Divyanshu Ranjan) completed the proof.
We've written up our experience in a free article in Towards Data Science/Medium. It goes over what we learned with examples and tips in the form of "rules". Some might read our write up and be discouraged that the process is not easier or more automatic. I, however, am encouraged that the process is possible at all.
The biggest surprise? That a working validation can stop working because the random search changes. That led to Rule 9: "Rework Your Validation for Reliability.".
Dafny folks may also be interested in our conclusion. Here is an excerpt:
Thanks to the Dafny community for both Dafny itself and for answering my many questions while working on this project!
--Carl
p.s. The "Rules":
Beta Was this translation helpful? Give feedback.
All reactions