Add backfill benchmarks #412

ryanslade · 2024-10-17T13:51:29Z

This change adds benchmarks that run against 10k, 100k and 1 million rows.

They benchmark:

How long it takes to complete a full backfill of a single column
How long it takes to update all rows in a table with and without a migration trigger in place

This should give us a baseline metric that we can use to compare performance over time.

Example output:

make bench 
go test ./internal/benchmarks -v -benchtime=1x -bench .
2024/10/21 12:44:01 github.com/testcontainers/testcontainers-go - Connected to docker: 
  Server Version: 27.2.0
  API Version: 1.46
  Operating System: Docker Desktop
  Total Memory: 7838 MB
  Labels:
    com.docker.desktop.address=unix:///Users/ryan/Library/Containers/com.docker.docker/Data/docker-cli.sock
  Testcontainers for Go Version: v0.33.0
  Resolved Docker Host: unix:///var/run/docker.sock
  Resolved Docker Socket Path: /var/run/docker.sock
  Test SessionID: 816adaef777204b01d23a061c6f5532ca8cea098c7f8c6a68fdf542fbfa73f6e
  Test ProcessID: bf2f6095-b21e-4569-a4df-52291606bf3d
2024/10/21 12:44:01 🐳 Creating container for image testcontainers/ryuk:0.8.1
2024/10/21 12:44:01 ✅ Container created: eab8b6af62ba
2024/10/21 12:44:01 🐳 Starting container: eab8b6af62ba
2024/10/21 12:44:01 ✅ Container started: eab8b6af62ba
2024/10/21 12:44:01 ⏳ Waiting for container id eab8b6af62ba image: testcontainers/ryuk:0.8.1. Waiting for: &{Port:8080/tcp timeout:<nil> PollInterval:100ms skipInternalCheck:false}
2024/10/21 12:44:01 🔔 Container is ready: eab8b6af62ba
2024/10/21 12:44:01 🐳 Creating container for image postgres:15.3
2024/10/21 12:44:01 ✅ Container created: 7bc6dfd7af00
2024/10/21 12:44:01 🐳 Starting container: 7bc6dfd7af00
2024/10/21 12:44:01 ✅ Container started: 7bc6dfd7af00
2024/10/21 12:44:01 ⏳ Waiting for container id 7bc6dfd7af00 image: postgres:15.3. Waiting for: &{timeout:<nil> deadline:0x14000435060 Strategies:[0x14000460540]}
2024/10/21 12:44:02 🔔 Container is ready: 7bc6dfd7af00
goos: darwin
goarch: arm64
pkg: github.com/xataio/pgroll/internal/benchmarks
cpu: Apple M2 Pro
BenchmarkBackfill
BenchmarkBackfill/10000
    benchmarks_test.go:136: Seeded 10000 rows in 19.073458ms (524289 rows/s)
    benchmarks_test.go:51: Backfilled 10000 rows in 102.083958ms
BenchmarkBackfill/10000-10        	      1	102083958 ns/op	    97959 rows/s
BenchmarkBackfill/100000
    benchmarks_test.go:136: Seeded 100000 rows in 96.639042ms (1034778 rows/s)
    benchmarks_test.go:51: Backfilled 100000 rows in 2.032871959s
BenchmarkBackfill/100000-10       	      1	2032871959 ns/op	    49191 rows/s
BenchmarkBackfill/1000000
    benchmarks_test.go:136: Seeded 1000000 rows in 608.590708ms (1643140 rows/s)
    benchmarks_test.go:51: Backfilled 1000000 rows in 56.80506s
BenchmarkBackfill/1000000-10      	      1	56805060000 ns/op	    17604 rows/s
BenchmarkWriteAmplification
BenchmarkWriteAmplification/NoTrigger
BenchmarkWriteAmplification/NoTrigger/10000
    benchmarks_test.go:136: Seeded 10000 rows in 21.901875ms (456582 rows/s)
BenchmarkWriteAmplification/NoTrigger/10000-10   	      1	 15013333 ns/op	   666075 rows/s
BenchmarkWriteAmplification/NoTrigger/100000
    benchmarks_test.go:136: Seeded 100000 rows in 98.442458ms (1015822 rows/s)
BenchmarkWriteAmplification/NoTrigger/100000-10  	      1	155141667 ns/op	   644572 rows/s
BenchmarkWriteAmplification/NoTrigger/1000000
    benchmarks_test.go:136: Seeded 1000000 rows in 663.248542ms (1507730 rows/s)
BenchmarkWriteAmplification/NoTrigger/1000000-10 	      1	1704721875 ns/op	   586606 rows/s
BenchmarkWriteAmplification/WithTrigger
BenchmarkWriteAmplification/WithTrigger/10000
    benchmarks_test.go:136: Seeded 10000 rows in 26.146708ms (382457 rows/s)
BenchmarkWriteAmplification/WithTrigger/10000-10 	      1	 59703417 ns/op	   167495 rows/s
BenchmarkWriteAmplification/WithTrigger/100000
    benchmarks_test.go:136: Seeded 100000 rows in 102.552667ms (975109 rows/s)
BenchmarkWriteAmplification/WithTrigger/100000-10         	      1	630408666 ns/op	   158627 rows/s
BenchmarkWriteAmplification/WithTrigger/1000000
    benchmarks_test.go:136: Seeded 1000000 rows in 666.005167ms (1501490 rows/s)
BenchmarkWriteAmplification/WithTrigger/1000000-10        	      1	5909246000 ns/op	   169226 rows/s
PASS
2024/10/21 12:45:51 🐳 Terminating container: 7bc6dfd7af00
2024/10/21 12:45:51 🚫 Container terminated: 7bc6dfd7af00
ok  	github.com/xataio/pgroll/internal/benchmarks	110.632s

Part of #408

ryanslade · 2024-10-18T13:13:44Z

@andrew-farries I think this is fine for now.

What I'd like to do is add the other benchmarks in separate PRs and then, once we have them all, work on the job of persisting the results to S3.

For now, we can always check out branches locally and run the benchmarks to spot check performance improvements.

ryanslade · 2024-10-18T13:14:31Z

Also, as we add more benchmarks some pattern will probably emerge which will help pull out some duplicated code.

ryanslade · 2024-10-21T10:34:11Z

I've just added tests for write amplification too. It does appear to be a bottleneck: having the triggers in place limits updates to around 15k rows/s on my machine, compared with about 60k without the trigger.

andrew-farries

👍 with a couple of questions.

.github/workflows/benchmark.yaml

internal/benchmarks/benchmarks_test.go

.github/workflows/benchmark.yaml

ryanslade · 2024-10-29T11:19:10Z

I'll merge now to get some baseline runs in CI, and then add checks that fail the CI run if a benchmark is more than 30% slower than the current baseline.
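
A rough sketch of what such a check could look like, reading the threshold as "more than 30% slower than the baseline". Nothing here exists in the repository yet: `checkRegression` and its parameters are hypothetical, and how baseline figures would be stored and loaded is left out.

```go
package main

import "fmt"

// checkRegression returns an error when currentNs exceeds baselineNs by more
// than maxSlowdown, expressed as a fraction (0.30 means 30%).
// Both timings are ns/op values for the same benchmark.
func checkRegression(name string, baselineNs, currentNs, maxSlowdown float64) error {
	if baselineNs <= 0 {
		return fmt.Errorf("%s: baseline must be positive, got %v", name, baselineNs)
	}
	limit := baselineNs * (1 + maxSlowdown)
	if currentNs > limit {
		return fmt.Errorf(
			"%s: %.0f ns/op is more than %.0f%% slower than the baseline of %.0f ns/op",
			name, currentNs, maxSlowdown*100, baselineNs,
		)
	}
	return nil
}
```

A CI step could call this once per benchmark and exit non-zero on the first error. Comparing ns/op rather than rows/s keeps the direction of the check simple: bigger is always worse.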

ryanslade added 9 commits October 17, 2024 15:50

Initial stab at a simple backfill benchmark

176d669

Remove TODO

53b6644

Add license header

e04ee36

Print row/s seeded

642fe97

Add some comments

4af35f5

Casing tweak

76bf742

Merge branch 'main' into rs/benchmark-scaffolding

9ce9a87

Merge branch 'main' into rs/benchmark-scaffolding

d08bd95

Run benchmarks on push to main

19601cc

ryanslade marked this pull request as ready for review October 18, 2024 13:12

ryanslade requested a review from andrew-farries October 18, 2024 13:12

ryanslade added 7 commits October 18, 2024 15:23

Temporarliy run benchmark action

1eeb68c

Run the correct make command

d27e958

Remove benchmark from build workflow

db57605

Merge branch 'main' into rs/benchmark-scaffolding

f5e74a5

Fix import path

3c9ea47

Remove TODO

4d46743

This is done in CI

Add benchmark for write amplification

dbe3cb1

ryanslade added 9 commits October 21, 2024 12:41

Move cleanup code into cleanup function

e327d56

Format and switch from assert to require

956e289

Merge branch 'main' into rs/benchmark-scaffolding

a54759e

Merge branch 'main' into rs/benchmark-scaffolding

01562e9

Merge branch 'main' into rs/benchmark-scaffolding

0930d8d

Merge branch 'main' into rs/benchmark-scaffolding

f1ab93d

Merge branch 'main' into rs/benchmark-scaffolding

df2bd12

Merge branch 'main' into rs/benchmark-scaffolding

7f6016a

Merge branch 'main' into rs/benchmark-scaffolding

b875daa

Merge branch 'main' into rs/benchmark-scaffolding

e41db77

andrew-farries approved these changes Oct 28, 2024

ryanslade added 5 commits October 28, 2024 16:57

Don't logs seeding row/s

cf96799

Update permissions

4552b43

Merge branch 'main' into rs/benchmark-scaffolding

655a7e4

Makefile comment

bb7c6db

Drop max benchmarked rows to 300k

656632a

This should still be enough for now and we can increase later if we need to.

ryanslade merged commit b6f76c7 into main Oct 29, 2024
25 checks passed

ryanslade deleted the rs/benchmark-scaffolding branch October 29, 2024 11:19
