client/dcr: Improved UTXO selection algorithm #2169

martonp · 2023-02-24T20:27:12Z

This updates the algorithm used to find the UTXOs with the least over fund. It does 1000 random selections of UTXOs, and picks the most optimal one.

Inspired by this:
https://github.com/bitcoin/bitcoin/blob/3015e0bca6bc2cb8beb747873fdf7b80e74d679f/src/wallet.cpp#L1129

This updates the algorithm used to find the UTXOs with the least over fund. It does 1000 random selections of UTXOs, and picks the most optimal one.

chappjc · 2023-02-24T20:32:12Z

client/asset/dcr/coin_selection_test.go

-		})
-	}
-}
-
 func Test_leastOverFund(t *testing.T) {
 	amt := uint64(10e8)
 	newU := func(amt float64) *compositeUTXO {


Thanks for this!

I guess the test cases for the dumber leastOverFund were too easy? We chatted a bit about replacing the algo and it sounded like the new one got better results ~1/2 the time. Would it require much larger sets to test a case where it's a better result?

The better results ~1/2 the time was with the dynamic programming solution which only worked with very small amounts of UTXOs. When I tested this one with large amounts of UTXOs and compared it with the old solution, it got better results every single time.

The two solutions (this vs small and large bias subset) are equivalent, but the old one tries 2 random possibilities and this one tries 1000.

Tests well. ~16ms for a set of 4000 UTXOs between 0 and 2 DCR in search of a subset totaling 10DCR

func Test_subsetFund(t *testing.T) { amt := uint64(10e8) s := make([]*compositeUTXO, 4000) newU := func(amt float64) *compositeUTXO { return &compositeUTXO{ rpc: &walletjson.ListUnspentResult{Amount: amt}, } } for i := range s { v := rand.Float64() + float64(rand.Int63n(2)) v = math.Round(v*1e8) / 1e8 fmt.Println(v) s[i] = newU(v) } sort.Slice(s, func(i, j int) bool { return s[i].rpc.Amount < s[i].rpc.Amount }) r := subsetWithLeastSumGreaterThan(amt, s) fmt.Println(len(r), sumUTXOs(r)) }

client/asset/dcr/coin_selection.go

chappjc · 2023-02-24T23:11:14Z

client/asset/dcr/coin_selection.go

+	iterations := 1000
+	for nRep := 0; nRep < iterations; nRep++ {
+		included := make([]bool, len(utxos))


This is the main bottleneck, causing ~2000 allocs per call. It may seem silly, but if we only allocate once and instead zero out the slice after each iteration, it screams. Copying memory is much faster than allocating memory.

diff --git a/client/asset/dcr/coin_selection.go b/client/asset/dcr/coin_selection.go index af90c5a79..2be8a5dbc 100644 --- a/client/asset/dcr/coin_selection.go +++ b/client/asset/dcr/coin_selection.go @@ -80,21 +80,24 @@ func sumUTXOs(set []*compositeUTXO) (tot uint64) { // unused UTXOs until the total value is greater than or equal to amt. func subsetWithLeastSumGreaterThan(amt uint64, utxos []*compositeUTXO) []*compositeUTXO { best := uint64(1 << 62) - var bestIncluded *[]bool + var bestIncluded []bool bestNumIncluded := 0 - iterations := 1000 + rnd := rand.New(rand.NewSource(rand.Int63())) + + included := make([]bool, len(utxos)) + + const iterations = 1000 for nRep := 0; nRep < iterations; nRep++ { - included := make([]bool, len(utxos)) - var found bool var nTotal uint64 var numIncluded int - for nPass := 0; nPass < 2 && !found; nPass++ { + passes: + for nPass := 0; nPass < 2; nPass++ { for i := 0; i < len(utxos); i++ { var use bool if nPass == 0 { - use = rand.Uint32()&1 == 1 + use = rnd.Int63()&1 == 1 } else { use = !included[i] } @@ -105,15 +108,21 @@ func subsetWithLeastSumGreaterThan(amt uint64, utxos []*compositeUTXO) []*compos if nTotal >= amt { if nTotal < best || (nTotal == best && numIncluded < bestNumIncluded) { best = nTotal - bestIncluded = &included + if bestIncluded == nil { + bestIncluded = make([]bool, len(utxos)) + } + copy(bestIncluded, included) bestNumIncluded = numIncluded - found = true + break passes // next iter } - break + break // next pass } } } } + for i := range included { + included[i] = false + } } if bestIncluded == nil { @@ -121,7 +130,7 @@ func subsetWithLeastSumGreaterThan(amt uint64, utxos []*compositeUTXO) []*compos } set := make([]*compositeUTXO, 0, len(utxos)) - for i, inc := range *bestIncluded { + for i, inc := range bestIncluded { if inc { set = append(set, utxos[i]) }

bench:

func Benchmark_subsetFund(b *testing.B) { amt := uint64(10e8) s := make([]*compositeUTXO, 4000) newU := func(amt float64) *compositeUTXO { return &compositeUTXO{ rpc: &walletjson.ListUnspentResult{Amount: amt}, } } for i := 0; i < b.N; i++ { b.StopTimer() for i := range s { v := rand.Float64() + float64(rand.Int63n(2)) v = math.Round(v*1e8) / 1e8 s[i] = newU(v) } sort.Slice(s, func(i, j int) bool { return s[i].rpc.Amount < s[i].rpc.Amount }) b.StartTimer() leastOverFund(amt, s) } }

before

Benchmark_subsetFund Benchmark_subsetFund-32 727 1637138 ns/op 4152855 B/op 2001 allocs/op

after

Benchmark_subsetFund Benchmark_subsetFund-32 3734 332022 ns/op 46345 B/op 4 allocs/op

Related, a clear() builtin is coming to Go in 1.21.

https://tip.golang.org/ref/spec#Clear
https://go-review.googlesource.com/c/go/+/448076
golang/go#56351
golang/go@99bc53f

Nice, looks better. I copied everything you had, except I only put one break passes in the outer if block.

except I only put one break passes in the outer if block.

That would seem to change the original method then. I was just eliminating the found bool, but I think it behaves differently now.

It does seem to match the cpp now however. Was just a deviation before?

The deviation was because I was considering that the slice was sorted in our case. In the cpp their array is not sorted and if they go over the required amount, they remove the element and keep trying. This actually works much better, but is a bit slower.

I've created a commit here that compares the results and shows the runtime of the shuffled one: cc28045
I think we should go for the slower but more accurate one.

Seems strange to shuffle first, but OK, let's do that and follow the same approach of remove and keep trying if it goes over. This keep trying approach though would seem to make the best == amt break of the outer nRep loop more important though, no?

Hm yeah a best == amt break would definitely not hurt.

chappjc · 2023-02-26T00:52:55Z

client/asset/dcr/coin_selection.go

+	rnd := rand.New(rand.NewSource(rand.Int63()))
+	included := make([]bool, len(utxos))
+	const iterations = 1000
+	for nRep := 0; nRep < iterations; nRep++ {


Only other thing from the cpp that might make sense is to break nreps (this outermost loop) if we happen to hit nTotal == amt.

chappjc · 2023-02-26T01:00:04Z

client/asset/dcr/coin_selection.go

+			for i := 0; i < len(utxos); i++ {
+				var use bool
+				if nPass == 0 {
+					use = rnd.Int63()&1 == 1
+				} else {
+					use = !included[i]
+				}
+				if use {


It still bugs me that there's no ternary operator in Go. The arguments against it are so lame.
We could do this, but it's less clear and concise: if (nPass == 0 && rnd.Int63()&1 == 1) || (nPass > 0 && !included[i]) {.
We'll never get a ternary and probably not a context.Merge either.

JoeGruffins

Looks good to me. I made a benchmark test that you can throw in the tests if it looks ok:

func BenchmarkLeastOverFund(b *testing.B) {
	// Same amounts every time.
	rnd := rand.New(rand.NewSource(1))
	utxos := make([]*compositeUTXO, 2_000)
	for i := range utxos {
		utxo := &compositeUTXO{
			rpc: &walletjson.ListUnspentResult{
				Amount: float64(1+rnd.Int31n(100)) / float64(1e8),
			},
		}
		utxos[i] = utxo
	}
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		leastOverFund(10_000, utxos)
	}
}

JoeGruffins · 2023-02-27T05:02:15Z

This is probably fine, but I noticed while writing the above test, if you allow zero amounts in the utxo values, they make it into the final utxos. I guess we can assume there will be no zero amounts however.

client/dcr: Improved UTXO selection algorithm

4f73efc

This updates the algorithm used to find the UTXOs with the least over fund. It does 1000 random selections of UTXOs, and picks the most optimal one.

chappjc reviewed Feb 24, 2023

View reviewed changes

chappjc added this to the 0.6 milestone Feb 24, 2023

Chappjc fixes

7a08ec4

chappjc reviewed Feb 26, 2023

View reviewed changes

JoeGruffins approved these changes Feb 27, 2023

View reviewed changes

Use shuffle

2a72c2b

martonp force-pushed the randomUTXOSelect branch from 09c771f to 2a72c2b Compare February 27, 2023 18:01

chappjc approved these changes Feb 27, 2023

View reviewed changes

chappjc merged commit 7363fe8 into decred:master Feb 28, 2023

chappjc added the bonds fidelity bonds label Mar 31, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

client/dcr: Improved UTXO selection algorithm #2169

client/dcr: Improved UTXO selection algorithm #2169

martonp commented Feb 24, 2023

chappjc Feb 24, 2023

martonp Feb 24, 2023

chappjc Feb 24, 2023

chappjc Feb 24, 2023 •

edited

Loading

chappjc Feb 24, 2023

martonp Feb 25, 2023

chappjc Feb 26, 2023

chappjc Feb 26, 2023

martonp Feb 26, 2023

chappjc Feb 27, 2023

martonp Feb 27, 2023

chappjc Feb 26, 2023

chappjc Feb 26, 2023

JoeGruffins left a comment

JoeGruffins commented Feb 27, 2023

client/dcr: Improved UTXO selection algorithm #2169

client/dcr: Improved UTXO selection algorithm #2169

Conversation

martonp commented Feb 24, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

chappjc Feb 24, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JoeGruffins left a comment

Choose a reason for hiding this comment

JoeGruffins commented Feb 27, 2023

chappjc Feb 24, 2023 •

edited

Loading