Add backfill benchmarks #412

Merged
merged 31 commits into from
Oct 29, 2024

@ryanslade ryanslade commented Oct 17, 2024

This change adds benchmarks that run against 10k, 100k and 1 million rows.

They benchmark:

  • How long it takes to complete a full backfill of a single column
  • How long it takes to update all rows in a table with and without a migration trigger in place

This should give us a baseline metric that we can use to compare performance over time.

Example output:

make bench 
go test ./internal/benchmarks -v -benchtime=1x -bench .
2024/10/21 12:44:01 github.com/testcontainers/testcontainers-go - Connected to docker: 
  Server Version: 27.2.0
  API Version: 1.46
  Operating System: Docker Desktop
  Total Memory: 7838 MB
  Labels:
    com.docker.desktop.address=unix:///Users/ryan/Library/Containers/com.docker.docker/Data/docker-cli.sock
  Testcontainers for Go Version: v0.33.0
  Resolved Docker Host: unix:///var/run/docker.sock
  Resolved Docker Socket Path: /var/run/docker.sock
  Test SessionID: 816adaef777204b01d23a061c6f5532ca8cea098c7f8c6a68fdf542fbfa73f6e
  Test ProcessID: bf2f6095-b21e-4569-a4df-52291606bf3d
2024/10/21 12:44:01 🐳 Creating container for image testcontainers/ryuk:0.8.1
2024/10/21 12:44:01 ✅ Container created: eab8b6af62ba
2024/10/21 12:44:01 🐳 Starting container: eab8b6af62ba
2024/10/21 12:44:01 ✅ Container started: eab8b6af62ba
2024/10/21 12:44:01 ⏳ Waiting for container id eab8b6af62ba image: testcontainers/ryuk:0.8.1. Waiting for: &{Port:8080/tcp timeout:<nil> PollInterval:100ms skipInternalCheck:false}
2024/10/21 12:44:01 🔔 Container is ready: eab8b6af62ba
2024/10/21 12:44:01 🐳 Creating container for image postgres:15.3
2024/10/21 12:44:01 ✅ Container created: 7bc6dfd7af00
2024/10/21 12:44:01 🐳 Starting container: 7bc6dfd7af00
2024/10/21 12:44:01 ✅ Container started: 7bc6dfd7af00
2024/10/21 12:44:01 ⏳ Waiting for container id 7bc6dfd7af00 image: postgres:15.3. Waiting for: &{timeout:<nil> deadline:0x14000435060 Strategies:[0x14000460540]}
2024/10/21 12:44:02 🔔 Container is ready: 7bc6dfd7af00
goos: darwin
goarch: arm64
pkg: github.com/xataio/pgroll/internal/benchmarks
cpu: Apple M2 Pro
BenchmarkBackfill
BenchmarkBackfill/10000
    benchmarks_test.go:136: Seeded 10000 rows in 19.073458ms (524289 rows/s)
    benchmarks_test.go:51: Backfilled 10000 rows in 102.083958ms
BenchmarkBackfill/10000-10        	      1	102083958 ns/op	    97959 rows/s
BenchmarkBackfill/100000
    benchmarks_test.go:136: Seeded 100000 rows in 96.639042ms (1034778 rows/s)
    benchmarks_test.go:51: Backfilled 100000 rows in 2.032871959s
BenchmarkBackfill/100000-10       	      1	2032871959 ns/op	    49191 rows/s
BenchmarkBackfill/1000000
    benchmarks_test.go:136: Seeded 1000000 rows in 608.590708ms (1643140 rows/s)
    benchmarks_test.go:51: Backfilled 1000000 rows in 56.80506s
BenchmarkBackfill/1000000-10      	      1	56805060000 ns/op	    17604 rows/s
BenchmarkWriteAmplification
BenchmarkWriteAmplification/NoTrigger
BenchmarkWriteAmplification/NoTrigger/10000
    benchmarks_test.go:136: Seeded 10000 rows in 21.901875ms (456582 rows/s)
BenchmarkWriteAmplification/NoTrigger/10000-10   	      1	 15013333 ns/op	   666075 rows/s
BenchmarkWriteAmplification/NoTrigger/100000
    benchmarks_test.go:136: Seeded 100000 rows in 98.442458ms (1015822 rows/s)
BenchmarkWriteAmplification/NoTrigger/100000-10  	      1	155141667 ns/op	   644572 rows/s
BenchmarkWriteAmplification/NoTrigger/1000000
    benchmarks_test.go:136: Seeded 1000000 rows in 663.248542ms (1507730 rows/s)
BenchmarkWriteAmplification/NoTrigger/1000000-10 	      1	1704721875 ns/op	   586606 rows/s
BenchmarkWriteAmplification/WithTrigger
BenchmarkWriteAmplification/WithTrigger/10000
    benchmarks_test.go:136: Seeded 10000 rows in 26.146708ms (382457 rows/s)
BenchmarkWriteAmplification/WithTrigger/10000-10 	      1	 59703417 ns/op	   167495 rows/s
BenchmarkWriteAmplification/WithTrigger/100000
    benchmarks_test.go:136: Seeded 100000 rows in 102.552667ms (975109 rows/s)
BenchmarkWriteAmplification/WithTrigger/100000-10         	      1	630408666 ns/op	   158627 rows/s
BenchmarkWriteAmplification/WithTrigger/1000000
    benchmarks_test.go:136: Seeded 1000000 rows in 666.005167ms (1501490 rows/s)
BenchmarkWriteAmplification/WithTrigger/1000000-10        	      1	5909246000 ns/op	   169226 rows/s
PASS
2024/10/21 12:45:51 🐳 Terminating container: 7bc6dfd7af00
2024/10/21 12:45:51 🚫 Container terminated: 7bc6dfd7af00
ok  	github.com/xataio/pgroll/internal/benchmarks	110.632s

Part of #408

@ryanslade ryanslade marked this pull request as ready for review October 18, 2024 13:12
@ryanslade
Collaborator Author

@andrew-farries I think this is fine for now.

What I'd like to do is add the other benchmarks in separate PRs and then, once we have them all, work on the job of persisting to S3.

For now, we can always check out branches locally and run the benchmarks to spot check performance improvements.

@ryanslade
Collaborator Author

Also, as we add more benchmarks, a pattern will probably emerge that helps us pull out the duplicated code.

@ryanslade
Collaborator Author

ryanslade commented Oct 21, 2024

I've just added tests for write amplification too. There does appear to be a bottleneck: with the triggers in place, updates are limited to around 15k rows/s on my machine, compared with about 60k rows/s without the trigger.
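
As a worked check of that ratio, the sketch below computes the slowdown from the `ns/op` values in the example output in the PR description. The `slowdown` helper is a hypothetical name introduced for illustration; only the numbers come from the output. It works out to roughly 3.5x to 4x slower with the trigger across the three table sizes.

```go
package main

import "fmt"

// slowdown returns how many times slower the run with the trigger is,
// given ns/op for the run without and with the trigger.
func slowdown(noTriggerNs, withTriggerNs float64) float64 {
	if noTriggerNs <= 0 {
		return 0
	}
	return withTriggerNs / noTriggerNs
}

func main() {
	// ns/op values copied from the BenchmarkWriteAmplification lines
	// in the example output.
	runs := []struct {
		rows                   int
		noTrigger, withTrigger float64
	}{
		{10_000, 15013333, 59703417},
		{100_000, 155141667, 630408666},
		{1_000_000, 1704721875, 5909246000},
	}
	for _, r := range runs {
		fmt.Printf("%d rows: %.2fx slower with trigger\n",
			r.rows, slowdown(r.noTrigger, r.withTrigger))
	}
}
```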

Collaborator

@andrew-farries andrew-farries left a comment

👍 with a couple of questions.

.github/workflows/benchmark.yaml (outdated, resolved)
internal/benchmarks/benchmarks_test.go (outdated, resolved)
.github/workflows/benchmark.yaml (resolved)
@ryanslade
Collaborator Author

I'll merge now to get some baseline runs in CI, and then add checks that fail the CI run if it is more than 30% slower than the current baseline.
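
One way such a check could look, as a sketch only: parse the `ns/op` value from a line of `go test -bench` output and compare it against a stored baseline with a 30% tolerance. Nothing here is from the pgroll codebase. `parseNsPerOp` and `exceedsThreshold` are hypothetical helpers, the "later run" line is made up for illustration, and how the baseline is stored is left open.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseNsPerOp extracts the ns/op value from one benchmark result line,
// e.g. "BenchmarkBackfill/10000-10  1  102083958 ns/op  97959 rows/s".
func parseNsPerOp(line string) (float64, error) {
	fields := strings.Fields(line)
	for i := 1; i < len(fields); i++ {
		if fields[i] == "ns/op" {
			return strconv.ParseFloat(fields[i-1], 64)
		}
	}
	return 0, fmt.Errorf("no ns/op value in %q", line)
}

// exceedsThreshold reports whether current is more than maxSlowdown
// (0.30 means 30%) slower than baseline. Both values are in ns/op.
func exceedsThreshold(baseline, current, maxSlowdown float64) bool {
	if baseline <= 0 {
		return false // no usable baseline, so nothing to compare against
	}
	return current > baseline*(1+maxSlowdown)
}

func main() {
	// Baseline: ns/op of the 10000-row backfill run in the PR description.
	baseline := 102083958.0
	// Hypothetical later run, invented for this example.
	line := "BenchmarkBackfill/10000-10 \t 1\t140000000 ns/op\t 71428 rows/s"

	current, err := parseNsPerOp(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	if exceedsThreshold(baseline, current, 0.30) {
		fmt.Println("regression: more than 30% slower than baseline")
	} else {
		fmt.Println("within threshold")
	}
}
```

A single `-benchtime=1x` run is noisy, so a real check would probably want to compare against several baseline runs rather than one number.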

@ryanslade ryanslade merged commit b6f76c7 into main Oct 29, 2024
25 checks passed
@ryanslade ryanslade deleted the rs/benchmark-scaffolding branch October 29, 2024 11:19