Sorting primitives #1032

weaversa · 2021-01-12T22:13:19Z

I don't know how useful these would be (outside of transposition ciphers), but wanted to drop them here for comment anyway. I'm also thinking it would be nice to have some primitives that give users the "feeling" of having variable length sequences.

/** Swap `i`th and `j`th entries of sequence `a` via `@@`/`updates` */
swap:
    {n, a, w}
    Integral w =>
    [n]a -> w -> w -> [n]a
swap seq i j = updates seq [i,j] (seq @@ [j,i])

/**
 * "Partition" a sequence `seq` by a filtering predicate `f` such
 * that the output `seq'` has all the items satisfying `f`, followed
 * by all items not satisfying `f`
 */
partition: {n, a} (fin n) => (a -> Bit) -> [n]a -> [n]a
partition f seq = take (last out).0
  where
    out = ([(seq, 0, 0)] : [1]([n]a, [max 1 (width n)], [max 1 (width n)]))
        # [ if f (w' @ i) then (w', i', j)
             | j <= i    then (w', i, i')
             | f (w' @ j)  then (swap w' i j, i', j')
            else (w', i, j')
            where
                i' = i + 1
                j' = j + 1
          | (w', i, j) <- out
          | _ <- tail [0 .. n : [width n]] ]

/**
 * A recursive version of partition
 */
partition':
    {n, a} fin n =>
    (a -> Bit) -> [n]a -> [n]a
partition' f w =
    if `n == (0: [width n]) then w
    | ~ (f (w @ 0)) then partition' f (take`{max 1 n - 1} (w <<< 1)) # (take`{min 1 n} [w @ 0])
    else (take`{min 1 n} [w @ 0]) # partition' f (drop`{min 1 n} w)

Main> let S = [1,2,3,4,5,6,0,0,7,0,0,0]
Main> partition' (\x -> x != 0) S
Showing a specific instance of polymorphic result:
  * Using 'Integer' for 1st type argument of '<interactive>::S'
[1, 2, 3, 4, 5, 6, 7, 0, 0, 0, 0, 0]
Main> partition (\x -> x != 0) S
Showing a specific instance of polymorphic result:
  * Using 'Integer' for 1st type argument of '<interactive>::S'
[1, 2, 3, 4, 5, 6, 7, 0, 0, 0, 0, 0]

And now that I look at it, I think all the [width ...] could be made into Integer instead. Though, maybe it's best to stick with bitvectors when possible...

The text was updated successfully, but these errors were encountered:

brianhuffman · 2021-01-13T00:53:39Z

Perhaps a useful generalization of partition would be something like the sortBy or sortOn functions we have in Haskell. In Cryptol they could have types like this:

sortBy : {a, n} (fin n) => (a -> a -> Bit) -> [n]a -> [n]a
sortOn : {a, b, n} (Cmp b, fin n) => (a -> b) -> [n]a -> [n]a
sortOn f = sortBy (\x y -> f x <= f y)

An ordinary sort function could be defined in terms of either of these:

sort : {a, n} (Cmp a, fin n) => [n]a -> [n]a
sort = sortBy (<=)
sort = sortOn (\x -> x)

When used with type b = Bit, sortOn would basically be your partition function, except that it would put the False elements first (because False < True). If you want the True elements at the front, you could define partition p = sortOn (\x -> ~ p x).

If we implement the sorting functions using a sorting network, they would probably work reasonably well even on symbolic inputs.

A sorting algorithm would be particularly nice to have in the standard library because it is a pretty tricky thing for users to implement themselves in Cryptol.

robdockins · 2021-01-13T01:36:01Z

Humm, yeah, a sorting primitive is an interesting idea. I think sortBy is basically the right primitive in that case, too. Do sorting networks result in stable sorts? That seems like a useful property for something like partition. You'd probably want elements to occur in the same relative order as the inputs.

weaversa · 2021-01-13T01:41:35Z

I like the idea of the sortBy primitive. And, yes, it would be nice if it was stable. Though, if the choice is between stable and symbolically terminating, I would choose the latter.

brianhuffman · 2021-01-13T03:21:45Z

If we provide a sortBy function, I definitely think it should be a stable sort.

Unfortunately the most efficient generic sorting networks like bitonic mergesort are not generally stable, but you can add a bit of extra structure to make them stable (by pairing each value with its index and including that in the comparisons). Another alternative would be to use an insertion-sort-style sorting network, which only does compare-and-swap of adjacent values, and is therefore stable. The algorithm would be quite simple to specify too. However, it would have worse asymptotics (O(n^2) swaps compared to O(n log^2 n) for bitonic mergesort).

brianhuffman · 2021-01-13T04:10:24Z

Here's an insertion sort that I believe to be stable:

sortBy : {a, n} (fin n) => (a -> a -> Bit) -> [n]a -> [n]a
sortBy le xs =
  if `n == (0 : Integer) then xs
  else drop`{1 - min 1 n} (insertBy le (xs@0) (sortBy le (drop`{min 1 n} xs)))

insertBy : {a, n} (fin n) => (a -> a -> Bit) -> a -> [n]a -> [n+1]a
insertBy le x0 ys = xys.0 # [last xs]
  where
    xs : [n+1]a
    xs = [x0] # xys.1
    xys : [n](a, a)
    xys = [ if le x y then (x, y) else (y, x) | x <- xs | y <- ys ]

The implementation would be a bit nicer if we implemented #701.

robdockins · 2021-01-13T18:28:43Z

I guess the interesting question to answer is "which formulation produces better symbolic formulae?" I suspect a bitonic sort with comparisons incorporating the original position will be slightly better, as the comparisons on positions will all start concrete, and the generated mux trees over positions will only depend on the results of comparisons made earlier in the sorting network.

brianhuffman · 2021-01-13T19:27:51Z

Using the above definition of sortBy (and a sort defined in terms of it) I was wondering whether it's actually possible to prove anything about it with symbolic inputs. Z3 actually did better than I expected:

sort> :prove \(xs:[8]Integer) -> sort xs == sort (reverse xs)
Q.E.D.
(Total Elapsed Time: 3.666s, using "Z3")

Setting prover=offline shows that this query sends about 300 lines of SMTLib to the solver, basically nothing but integer comparisons and ite.

I could also prove that the first and last elements are the right way around:

sort> :prove \(xs:[16]Integer) -> head (sort xs) <= last (sort xs)
Q.E.D.
(Total Elapsed Time: 0.915s, using "Z3")

Most properties about sort don't really scale very well, though, as you might expect:

sort> :prove \(xs:[6]Integer) -> sum (sort xs) == sum xs
Q.E.D.
(Total Elapsed Time: 0.333s, using "Z3")
sort> :prove \(xs:[7]Integer) -> sum (sort xs) == sum xs
Q.E.D.
(Total Elapsed Time: 3.341s, using "Z3")
sort> :prove \(xs:[8]Integer) -> sum (sort xs) == sum xs
Q.E.D.
(Total Elapsed Time: 46.215s, using "Z3")

So it seems that the insertion-sort-style sorting network is at least somewhat reasonable for symbolic stuff. The performance on concrete input is not so great though; sorting a list of 1000 integers takes about 8 or 9 seconds on my machine, and that grows quadratically with the length of the list.

brianhuffman · 2021-01-13T19:30:18Z

Here's another one that I thought Z3 did surprisingly well on:

sort> :prove \(xs:[16]Integer) -> sort (sort xs) == sort xs
Q.E.D.
(Total Elapsed Time: 9.704s, using "Z3")

weaversa · 2021-01-13T19:38:23Z

I got this one to go through...

Main> :prove \(xs : [20]Integer) (ys : [20]Integer) -> [x >= 0 /\ x < 20 | x <- xs] == ~0 ==> [ yi <= yj | yi <- sort (ys @@ xs) | yj <- tail (sort (ys @@ xs)) ] == ~0
Q.E.D.
(Total Elapsed Time: 7.818s, using "Z3")

WeeknightMVP · 2021-01-15T03:37:26Z

mergesort.zip

Here's a mergesort for kicks and giggles, along with some properties of sorts. Unfortunately, this version performs more poorly on most of the same properties as tried for the insertion sort earlier:

mergesort> let sort = mergesorted_ (<=)
mergesort> :prove \(xs:[8]Integer) -> sort xs == sort (reverse xs)
Q.E.D.
(Total Elapsed Time: 3.850s, using "Z3")
mergesort> :prove \(xs:[8]Integer) -> head (sort xs) <= last (sort xs)
Q.E.D.
(Total Elapsed Time: 0.195s, using "Z3")
mergesort> :prove \(xs:[12]Integer) -> head (sort xs) <= last (sort xs)
Q.E.D.
(Total Elapsed Time: 1.839s, using "Z3")
mergesort> :prove \(xs:[7]Integer) -> sum (sort xs) == sum xs
Q.E.D.
(Total Elapsed Time: 4.172s, using "Z3")
mergesort> :prove \(xs:[10]Integer) -> sort (sort xs) == sort xs
Q.E.D.
(Total Elapsed Time: 8.097s, using "Z3")

Here's a runthrough of some more explicit sort properties:

mergesort> :prove is_sort`{6,Integer} mergesorted_ (<=)
Q.E.D.
(Total Elapsed Time: 1.844s, using "Z3")
mergesort> :prove is_sort`{8,Integer} mergesorted_ (<=)
Q.E.D.
(Total Elapsed Time: 31.755s, using "Z3")
mergesort> :prove is_mergesorted_a_stable_sort`{4,(Char, Integer)} (\x -> \y -> x.0 <= y.0)
Q.E.D.
(Total Elapsed Time: 1.737s, using "Z3")
mergesort> :prove is_mergesorted_a_stable_sort`{6,(Char, Integer)} (\x -> \y -> x.0 <= y.0)
Q.E.D.
(Total Elapsed Time: 1m:29.685s, using "Z3")

Specifying in Cryptol whether an arbitrary sort is stable (w.r.t. its comparator) has thus far eluded me...

WeeknightMVP · 2021-01-15T20:19:09Z

mergesort.zip

OK, this attachment defines stable sorts, orders, and some utility functions in distinct modules. mergesorted_ id (<=) is stable, though this proof scales even more poorly...

mergesort> :prove is_stable_sort`{4,Integer} mergesorted_ id (<=)
Q.E.D.
(Total Elapsed Time: 0.189s, using "Z3")
mergesort> :prove is_stable_sort`{6,Integer} mergesorted_ id (<=)
Q.E.D.
(Total Elapsed Time: 4.577s, using "Z3")
mergesort> :prove is_stable_sort`{7,Integer} mergesorted_ id (<=)
Q.E.D.
(Total Elapsed Time: 38.780s, using "Z3")
mergesort> :prove is_stable_sort`{8,Integer} mergesorted_ id (<=)
...^C

robdockins · 2021-03-23T19:23:30Z

Here's an insertion sort that I believe to be stable:

sortBy : {a, n} (fin n) => (a -> a -> Bit) -> [n]a -> [n]a
sortBy le xs =
  if `n == (0 : Integer) then xs
  else drop`{1 - min 1 n} (insertBy le (xs@0) (sortBy le (drop`{min 1 n} xs)))

insertBy : {a, n} (fin n) => (a -> a -> Bit) -> a -> [n]a -> [n+1]a
insertBy le x0 ys = xys.0 # [last xs]
  where
    xs : [n+1]a
    xs = [x0] # xys.1
    xys : [n](a, a)
    xys = [ if le x y then (x, y) else (y, x) | x <- xs | y <- ys ]

The implementation would be a bit nicer if we implemented #701.

Should we add basically this implementation to the Prelude and later consider the question of making it a primitive?

msaaltink · 2021-03-25T02:30:52Z

The original post asked it would be nice to have some primitives that give users the "feeling" of having variable length sequences. It is interesting to see if we can take that seriously.

Here's a module for lists:

module listOps where

type list a = [inf](Bool, a)

nil: {a} (Zero a) => list a
nil = [(False, zero)] # nil

cons: {a} a -> list a -> list a
cons x y = [(True, x)] # y

null: {a} list a -> Bit
null x = ~a where (a,_) = x @ 0

list_case: {a,b} (a-> list a -> b) -> b -> list a -> b
list_case f_cons v_nil xs = if is_cons then f_cons x rest else v_nil
  where
    (is_cons, x) = xs@0
    rest = drop`{1} xs
// list_case f b nil => b 
// list_case f b (cons x y) => f x y

list_foldr: {a,b} (a -> b -> b) -> b -> list a -> b
list_foldr f b lst = (if is_cons then f v (list_foldr f b (tail lst)) else b) where
  (is_cons, v) = lst@0
// list_foldr f b nil = b
// list_foldr f b (cons x xs) = f x (list_foldr f b xs)

list_foldl: {a,b} (b -> a -> b) -> b -> list a -> b
list_foldl f b lst = (if is_cons then list_foldl f (f b v) (tail lst) else b) where
  (is_cons, v) = lst@0
// list_foldl f b nil = b
// list_foldl f b (cons x xs) = list_foldl f (f b x) xs

// getting to and from fixed-length sequences

to_list: {n,t} (fin n, Zero t) => [n]t -> list t
to_list xs = [(True, x) | x <- xs] # nil

from_list: {a,n} (fin n) => list a -> [n]a
from_list lst = [x | (_,x) <- take lst]

Given that, here's an insertion sort:

import listOps

ins: {t} (Cmp t, Zero t) => t -> list t -> list t
ins x ys = list_case (\ y rest -> if x <= y then  cons x ys else cons y (ins x rest))
                     (cons x nil)
                     ys

ins_sort_list = list_foldr insert nil

ins_sort: {n,t} (fin n, Cmp t, Zero t) => [n]t -> [n]t
ins_sort xs = from_list (ins_sort_list (to_list xs))

Or, with even more of a lispy feel, here' a merge sort

car = list_case ( \ x _ -> x) (error "no car")
cdr = list_case ( \ _ xs -> xs) nil

// ... split _list divides into two  parts, roughly equal in size
// ... split_list '(1 2 3) = ('(2),  '(3 1))
split_list: {t} (Zero t) => list t -> (list t, list t)
split_list xs = split' xs nil nil where
  split' ys a1 a2 = if null ys then (a1,a2) else split' (cdr ys) a2 (cons (car ys) a1)

merge xs ys =
  if null xs then ys
   | null ys then xs
   | car xs <= car ys then cons (car xs) (merge (cdr xs) ys)
   else cons (car ys) (merge xs (cdr ys))

is_small_list x = null x \/ null (cdr x)

merge_sort_list: {t} (Zero t, Cmp t) => list t -> list t
merge_sort_list xs =
  if is_small_list xs
  then xs
  else merge (merge_sort_list a1) (merge_sort_list a2) where (a1,a2) = split_list xs

merge_sort: {n,t} (fin n, Cmp t, Zero t) => [n]t -> [n]t
merge_sort xs = from_list (merge_sort_list (to_list xs))

The merge sort is more efficient at longer lists.

list_sort> :prove merge_sort (reverse [0 .. 299]) == [0 .. 299]
Q.E.D.
(Total Elapsed Time: 11.490s, using "Z3")
list_sort> :prove ins_sort (reverse [0 .. 299]) == [0 .. 299]
Q.E.D.
(Total Elapsed Time: 24.309s, using "Z3")

And we can still prove properties for fixed-length inputs.

list_sort> :prove \(xs:[6]Integer) -> merge_sort xs == ins_sort xs
Q.E.D.
(Total Elapsed Time: 0.495s, using "Z3")
list_sort> :prove \(xs:[8]Integer) -> merge_sort xs == ins_sort xs
Q.E.D.
(Total Elapsed Time: 3.263s, using "Z3")
list_sort> :prove \(xs:[6]Integer) -> merge_sort xs == merge_sort (merge_sort xs)
Q.E.D.
(Total Elapsed Time: 0.487s, using "Z3")
list_sort> :prove \(xs:[6]Integer) -> ins_sort xs == ins_sort (ins_sort xs)
Q.E.D.
(Total Elapsed Time: 0.407s, using "Z3")
list_sort> :prove \(xs:[8]Integer) -> merge_sort xs == merge_sort (reverse xs)
Q.E.D.
(Total Elapsed Time: 2.563s, using "Z3")

Perhaps this is more of the "feel" that @weaversa was looking for?

weaversa · 2021-03-25T02:46:03Z

Thanks @msaaltink! I like the list type using inf. These are definitely in line with what I had in mind. It's often impractical to ask the type system to manage sizes of variable length sequences. Here are a few more functions I've found useful:

empty = zero  // A tuple here would allow one to use zero as an element

push : {a} [inf]a -> a -> [inf]a
push s x = [x] # s

pop : {a} [inf]a -> (a, [inf]a)
pop s = (head s, tail s)

Main> foldl push empty [5, 8, 10, 11]
Showing a specific instance of polymorphic result:
  * Using 'Integer' for type of sequence member
[11, 10, 8, 5, 0, ...]

Here's also a nice dictionary data structure (showed to me by @LeventErkok 11ish years ago):

dictionary d k v = \q -> if q==k then v else d q

Main> (dictionary (dictionary empty "key1" 41) "key2" 89) "key2"
Showing a specific instance of polymorphic result:
  * Using 'Integer' for 2nd type argument of 'Main::dictionary'
41
Main> (dictionary (dictionary empty "key1" 41) "key2" 89) "key2"
Showing a specific instance of polymorphic result:
  * Using 'Integer' for 2nd type argument of 'Main::dictionary'
89

linesthatinterlace · 2021-05-10T17:22:40Z

If people might be interested, I've implemented djbsort (https://sorting.cr.yp.to/) for 32-bit signed integers in Cryptol (it's basically a sorting network). Just found this issue...

This was relevant in the context of Classic McEliece, which uses djbsort (and permutation networks) in its key generation.

brianhuffman added the algorithms in Cryptol Requires writing Cryptol code, rather than modifying the implementation's source code label Jan 13, 2021

robdockins changed the title ~~Prelude candidates~~ Sorting primitives Jan 15, 2021

weaversa mentioned this issue May 4, 2021

> Here's an insertion sort that I believe to be stable: #1190

Closed

brianhuffman linked a pull request May 6, 2021 that will close this issue

Sort function #1193

Merged

brianhuffman closed this as completed in #1193 May 10, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sorting primitives #1032

Sorting primitives #1032

weaversa commented Jan 12, 2021

brianhuffman commented Jan 13, 2021 •

edited

Loading

robdockins commented Jan 13, 2021

weaversa commented Jan 13, 2021

brianhuffman commented Jan 13, 2021

brianhuffman commented Jan 13, 2021

robdockins commented Jan 13, 2021

brianhuffman commented Jan 13, 2021

brianhuffman commented Jan 13, 2021

weaversa commented Jan 13, 2021 •

edited

Loading

WeeknightMVP commented Jan 15, 2021

WeeknightMVP commented Jan 15, 2021 •

edited

Loading

robdockins commented Mar 23, 2021

msaaltink commented Mar 25, 2021 •

edited

Loading

weaversa commented Mar 25, 2021

linesthatinterlace commented May 10, 2021

Sorting primitives #1032

Sorting primitives #1032

Comments

weaversa commented Jan 12, 2021

brianhuffman commented Jan 13, 2021 • edited Loading

robdockins commented Jan 13, 2021

weaversa commented Jan 13, 2021

brianhuffman commented Jan 13, 2021

brianhuffman commented Jan 13, 2021

robdockins commented Jan 13, 2021

brianhuffman commented Jan 13, 2021

brianhuffman commented Jan 13, 2021

weaversa commented Jan 13, 2021 • edited Loading

WeeknightMVP commented Jan 15, 2021

WeeknightMVP commented Jan 15, 2021 • edited Loading

robdockins commented Mar 23, 2021

msaaltink commented Mar 25, 2021 • edited Loading

weaversa commented Mar 25, 2021

linesthatinterlace commented May 10, 2021

brianhuffman commented Jan 13, 2021 •

edited

Loading

weaversa commented Jan 13, 2021 •

edited

Loading

WeeknightMVP commented Jan 15, 2021 •

edited

Loading

msaaltink commented Mar 25, 2021 •

edited

Loading