Create performance benchmarks for key pgroll features #408
Comments
I'd like to have a go at this. In a perfect world we'd probably want to run these against every commit, but I imagine they may take a while to run and I don't want to affect the velocity of getting things into

Apart from actually writing the benchmarks, we need to decide on a few things:

Anything else?
I think what you suggest is a good start. We want the benchmarks for a couple of reasons:
I suggest running the benchmarks as a separate workflow that is automatically run on changes to

A consistent environment in terms of hardware and probably also software (maybe run the benchmarks in a container) is a must too. Results could be uploaded to object storage and pulled from there into our docs.
This change adds benchmarks that run against 10k, 100k and 1 million rows. They benchmark:

* How long it takes to complete a full backfill of a single column
* How long it takes to update all rows in a table with and without a migration trigger in place

This should give us a baseline metric that we can use to compare performance over time.

Example output:

```
make bench
go test ./internal/benchmarks -v -benchtime=1x -bench .
2024/10/21 12:44:01 github.com/testcontainers/testcontainers-go - Connected to docker:
  Server Version: 27.2.0
  API Version: 1.46
  Operating System: Docker Desktop
  Total Memory: 7838 MB
  Labels:
    com.docker.desktop.address=unix:///Users/ryan/Library/Containers/com.docker.docker/Data/docker-cli.sock
  Testcontainers for Go Version: v0.33.0
  Resolved Docker Host: unix:///var/run/docker.sock
  Resolved Docker Socket Path: /var/run/docker.sock
  Test SessionID: 816adaef777204b01d23a061c6f5532ca8cea098c7f8c6a68fdf542fbfa73f6e
  Test ProcessID: bf2f6095-b21e-4569-a4df-52291606bf3d
2024/10/21 12:44:01 🐳 Creating container for image testcontainers/ryuk:0.8.1
2024/10/21 12:44:01 ✅ Container created: eab8b6af62ba
2024/10/21 12:44:01 🐳 Starting container: eab8b6af62ba
2024/10/21 12:44:01 ✅ Container started: eab8b6af62ba
2024/10/21 12:44:01 ⏳ Waiting for container id eab8b6af62ba image: testcontainers/ryuk:0.8.1. Waiting for: &{Port:8080/tcp timeout:<nil> PollInterval:100ms skipInternalCheck:false}
2024/10/21 12:44:01 🔔 Container is ready: eab8b6af62ba
2024/10/21 12:44:01 🐳 Creating container for image postgres:15.3
2024/10/21 12:44:01 ✅ Container created: 7bc6dfd7af00
2024/10/21 12:44:01 🐳 Starting container: 7bc6dfd7af00
2024/10/21 12:44:01 ✅ Container started: 7bc6dfd7af00
2024/10/21 12:44:01 ⏳ Waiting for container id 7bc6dfd7af00 image: postgres:15.3. Waiting for: &{timeout:<nil> deadline:0x14000435060 Strategies:[0x14000460540]}
2024/10/21 12:44:02 🔔 Container is ready: 7bc6dfd7af00
goos: darwin
goarch: arm64
pkg: github.com/xataio/pgroll/internal/benchmarks
cpu: Apple M2 Pro
BenchmarkBackfill
BenchmarkBackfill/10000
    benchmarks_test.go:136: Seeded 10000 rows in 19.073458ms (524289 rows/s)
    benchmarks_test.go:51: Backfilled 10000 rows in 102.083958ms
BenchmarkBackfill/10000-10                              1    102083958 ns/op     97959 rows/s
BenchmarkBackfill/100000
    benchmarks_test.go:136: Seeded 100000 rows in 96.639042ms (1034778 rows/s)
    benchmarks_test.go:51: Backfilled 100000 rows in 2.032871959s
BenchmarkBackfill/100000-10                             1   2032871959 ns/op     49191 rows/s
BenchmarkBackfill/1000000
    benchmarks_test.go:136: Seeded 1000000 rows in 608.590708ms (1643140 rows/s)
    benchmarks_test.go:51: Backfilled 1000000 rows in 56.80506s
BenchmarkBackfill/1000000-10                            1  56805060000 ns/op     17604 rows/s
BenchmarkWriteAmplification
BenchmarkWriteAmplification/NoTrigger
BenchmarkWriteAmplification/NoTrigger/10000
    benchmarks_test.go:136: Seeded 10000 rows in 21.901875ms (456582 rows/s)
BenchmarkWriteAmplification/NoTrigger/10000-10          1     15013333 ns/op    666075 rows/s
BenchmarkWriteAmplification/NoTrigger/100000
    benchmarks_test.go:136: Seeded 100000 rows in 98.442458ms (1015822 rows/s)
BenchmarkWriteAmplification/NoTrigger/100000-10         1    155141667 ns/op    644572 rows/s
BenchmarkWriteAmplification/NoTrigger/1000000
    benchmarks_test.go:136: Seeded 1000000 rows in 663.248542ms (1507730 rows/s)
BenchmarkWriteAmplification/NoTrigger/1000000-10        1   1704721875 ns/op    586606 rows/s
BenchmarkWriteAmplification/WithTrigger
BenchmarkWriteAmplification/WithTrigger/10000
    benchmarks_test.go:136: Seeded 10000 rows in 26.146708ms (382457 rows/s)
BenchmarkWriteAmplification/WithTrigger/10000-10        1     59703417 ns/op    167495 rows/s
BenchmarkWriteAmplification/WithTrigger/100000
    benchmarks_test.go:136: Seeded 100000 rows in 102.552667ms (975109 rows/s)
BenchmarkWriteAmplification/WithTrigger/100000-10       1    630408666 ns/op    158627 rows/s
BenchmarkWriteAmplification/WithTrigger/1000000
    benchmarks_test.go:136: Seeded 1000000 rows in 666.005167ms (1501490 rows/s)
BenchmarkWriteAmplification/WithTrigger/1000000-10      1   5909246000 ns/op    169226 rows/s
PASS
2024/10/21 12:45:51 🐳 Terminating container: 7bc6dfd7af00
2024/10/21 12:45:51 🚫 Container terminated: 7bc6dfd7af00
ok      github.com/xataio/pgroll/internal/benchmarks    110.632s
```

Part of #408
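The `rows/s` column in the output above is a custom metric rather than one of Go's built-in benchmark columns. A minimal sketch of how such a metric can be reported with `b.ReportMetric`, with one sub-benchmark per row count, might look like the following. This is not the code in `benchmarks_test.go`: `simulateBackfill` is a hypothetical in-memory stand-in for the real work against Postgres, and `rowsPerSecond`, `slowdown` and `backfillBenchmark` are illustrative helper names. The `main` function also recomputes two figures from the example run.

```go
package main

import (
	"fmt"
	"testing"
)

// rowsPerSecond converts a benchmark's ns/op figure into rows/s,
// assuming each operation processed `rows` rows.
func rowsPerSecond(rows int, nsPerOp int64) float64 {
	return float64(rows) / (float64(nsPerOp) / 1e9)
}

// slowdown reports how many times longer otherNsPerOp is than baseNsPerOp.
func slowdown(baseNsPerOp, otherNsPerOp int64) float64 {
	return float64(otherNsPerOp) / float64(baseNsPerOp)
}

// simulateBackfill is a HYPOTHETICAL stand-in for the real backfill;
// it only loops in memory so the sketch runs without a database.
func simulateBackfill(rows int) int {
	sum := 0
	for i := 0; i < rows; i++ {
		sum += i % 7
	}
	return sum
}

// sink keeps the compiler from optimising the simulated work away.
var sink int

// backfillBenchmark returns a benchmark body for one row count and
// reports throughput as a custom "rows/s" metric.
func backfillBenchmark(rows int) func(b *testing.B) {
	return func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = simulateBackfill(rows)
		}
		// b.Elapsed requires Go 1.20 or newer.
		b.ReportMetric(float64(rows)*float64(b.N)/b.Elapsed().Seconds(), "rows/s")
	}
}

// BenchmarkBackfill shows how the sub-benchmark names in the output
// (BenchmarkBackfill/10000 and so on) arise when placed in a _test.go file.
func BenchmarkBackfill(b *testing.B) {
	for _, rows := range []int{10_000, 100_000, 1_000_000} {
		b.Run(fmt.Sprint(rows), backfillBenchmark(rows))
	}
}

func main() {
	// Figures taken from the example output above.
	fmt.Printf("%.0f rows/s\n", rowsPerSecond(10_000, 102083958)) // prints "97959 rows/s"
	fmt.Printf("%.2fx\n", slowdown(15013333, 59703417))           // prints "3.98x"

	// Run one leaf benchmark outside `go test` and read the custom metric back.
	testing.Init()
	res := testing.Benchmark(backfillBenchmark(10_000))
	fmt.Println("reported rows/s > 0:", res.Extra["rows/s"] > 0)
}
```

By this arithmetic, the 10,000-row update with the trigger in place took roughly four times as long as the same update without it in the example run.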
This uses a GitHub action to download benchmark data from S3. It then uses a static generator to create charts per benchmark / Postgres version. These results are then bundled up using Jekyll, which publishes the README, available at: https://xataio.github.io/pgroll/

The benchmarks themselves can be seen at https://xataio.github.io/pgroll/benchmarks.html (and are also linked from the README).

To make this work without requiring users of pgroll as a library to import the charting library, we needed to create a new module in `/dev`. We also needed to duplicate the definition of the benchmark results structs, since we couldn't import them from the main `pgroll` module without actually publishing a new version of the module. Once that version has been published, importing them will be possible.

Part of #408
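The result structs and the chart generator are not shown in this thread. Purely as an illustration of grouping results "per benchmark / Postgres version", here is a sketch under stated assumptions: the `BenchmarkResult` type, its JSON field names and the `groupResults` helper are all hypothetical and are not pgroll's actual definitions. The sample values are taken from the example benchmark output earlier in the thread (Postgres 15.3).

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// BenchmarkResult is a HYPOTHETICAL record shape; the real structs
// in pgroll may differ in both fields and JSON names.
type BenchmarkResult struct {
	Name            string  `json:"name"`
	PostgresVersion string  `json:"postgres_version"`
	Rows            int     `json:"rows"`
	RowsPerSecond   float64 `json:"rows_per_second"`
}

// seriesKey identifies one chart series: a benchmark on one Postgres version.
type seriesKey struct {
	Name            string
	PostgresVersion string
}

// sampleJSON holds values copied from the example run in this thread.
const sampleJSON = `[
  {"name": "Backfill", "postgres_version": "15.3", "rows": 100000, "rows_per_second": 49191},
  {"name": "Backfill", "postgres_version": "15.3", "rows": 10000, "rows_per_second": 97959},
  {"name": "WriteAmplification/WithTrigger", "postgres_version": "15.3", "rows": 10000, "rows_per_second": 167495}
]`

// groupResults decodes a JSON array of results and groups them by
// benchmark name and Postgres version, each group sorted by row count.
func groupResults(data []byte) (map[seriesKey][]BenchmarkResult, error) {
	var results []BenchmarkResult
	if err := json.Unmarshal(data, &results); err != nil {
		return nil, fmt.Errorf("decoding results: %w", err)
	}
	series := make(map[seriesKey][]BenchmarkResult)
	for _, r := range results {
		k := seriesKey{Name: r.Name, PostgresVersion: r.PostgresVersion}
		series[k] = append(series[k], r)
	}
	for _, s := range series {
		sort.Slice(s, func(i, j int) bool { return s[i].Rows < s[j].Rows })
	}
	return series, nil
}

func main() {
	series, err := groupResults([]byte(sampleJSON))
	if err != nil {
		panic(err)
	}
	fmt.Println("series:", len(series))
	for _, r := range series[seriesKey{Name: "Backfill", PostgresVersion: "15.3"}] {
		fmt.Printf("Backfill on Postgres %s: %d rows at %.0f rows/s\n",
			r.PostgresVersion, r.Rows, r.RowsPerSecond)
	}
}
```

Keying each series on both the benchmark name and the Postgres version keeps results from different versions on separate lines of a chart, which is one plausible reading of the charts described above.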
Add a benchmark for the `read_schema` function, which has caused regressions in the past.

Part of #408
@andrew-farries @exekias The basic benchmarks are all done now. Shall we close this? We can open new issues if we want to add more benchmarks, change how they're run, etc.
Yes, let's close it.
Gather benchmark data for the following parts of `pgroll`:

* `UPDATE` heavy tables?
* `read_schema` query, run on every DDL statement to capture 'inferred' migrations.

Having these benchmarks in place would allow us to measure performance improvements over time and avoid regressions.