@westonpace
Member

There is a Rust version and a Python version. The Python version is a little simpler and easier to run, but the Rust version has been helpful to me for profiling.

We already have some regression benchmarks for latency, but throughput can behave differently from latency.

@westonpace
Member Author

Scores on my local desktop:

ivf_pq_throughput/2.0_1threads_cached
                        time:   [502.18 ms 504.09 ms 505.86 ms]
                        thrpt:  [197.68  elem/s 198.38  elem/s 199.13  elem/s]
ivf_pq_throughput/2.0_1threads_nocache
                        time:   [1.1748 s 1.1769 s 1.1795 s]
                        thrpt:  [84.785  elem/s 84.969  elem/s 85.121  elem/s]
ivf_pq_throughput/2.0_16threads_cached
                        time:   [268.23 ms 270.15 ms 271.91 ms]
                        thrpt:  [367.76  elem/s 370.17  elem/s 372.82  elem/s]
ivf_pq_throughput/2.0_16threads_nocache
                        time:   [308.21 ms 312.46 ms 317.15 ms]
                        thrpt:  [315.31  elem/s 320.04  elem/s 324.46  elem/s]
ivf_pq_throughput/2.1_1threads_cached
                        time:   [513.79 ms 516.59 ms 518.99 ms]
                        thrpt:  [192.68  elem/s 193.58  elem/s 194.63  elem/s]
ivf_pq_throughput/2.1_1threads_nocache
                        time:   [1.1830 s 1.1846 s 1.1861 s]
                        thrpt:  [84.307  elem/s 84.419  elem/s 84.534  elem/s]
ivf_pq_throughput/2.1_16threads_cached
                        time:   [273.70 ms 275.06 ms 276.51 ms]
                        thrpt:  [361.65  elem/s 363.56  elem/s 365.37  elem/s]
ivf_pq_throughput/2.1_16threads_nocache
                        time:   [313.85 ms 317.14 ms 320.12 ms]
                        thrpt:  [312.38  elem/s 315.32  elem/s 318.62  elem/s]
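
For reading the scores above: Criterion reports throughput as elements divided by time, so multiplying the median time by the median throughput recovers the number of elements (queries) per iteration. The batch size is not stated in this thread; the sketch below only infers it from the reported 2.0 numbers, and also computes the 1-thread to 16-thread scaling. The dictionary layout and the `speedup` helper are illustrative, not part of the benchmark.

```python
# Median time (seconds) and median throughput (elem/s), copied from the
# 2.0 results above.
results = {
    "1threads_cached": (0.50409, 198.38),
    "1threads_nocache": (1.1769, 84.969),
    "16threads_cached": (0.27015, 370.17),
    "16threads_nocache": (0.31246, 320.04),
}

# time * throughput = elements per iteration. This is an inference from the
# reported numbers, not something stated in the thread.
for name, (seconds, qps) in results.items():
    print(f"{name}: ~{seconds * qps:.1f} queries per iteration")


def speedup(cache: str) -> float:
    """Throughput ratio of 16 threads over 1 thread for a cache setting."""
    return results[f"16threads_{cache}"][1] / results[f"1threads_{cache}"][1]


for cache in ("cached", "nocache"):
    print(f"{cache}: {speedup(cache):.2f}x from 1 to 16 threads")
```

On these numbers the uncached case scales much better with threads than the cached case, which is consistent with the uncached queries spending more of their time waiting on I/O.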

@westonpace
Member Author

westonpace commented Jan 6, 2026

The numbers are queries per second. These queries are pretty I/O heavy: with nprobes=20, K=50, and refine_factor=10, each query needs about 500 I/O operations (K × refine_factor = 50 × 10). Even so, the NVMe disk I'm testing against should be able to handle 500-800K IOPS, so I'd expect a roofline around 1000-1600 QPS.
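
The roofline estimate above can be written out as a short calculation. This is a back-of-the-envelope sketch, assuming each query costs K × refine_factor random reads and that the disk sustains the quoted 500K-800K IOPS; the function names are illustrative only.

```python
def reads_per_query(k: int, refine_factor: int) -> int:
    """Assumed I/O cost of one query: K * refine_factor random reads."""
    return k * refine_factor


def roofline_qps(disk_iops: int, reads: int) -> float:
    """Upper bound on queries per second if disk IOPS is the only limit."""
    return disk_iops / reads


reads = reads_per_query(k=50, refine_factor=10)
low = roofline_qps(500_000, reads)
high = roofline_qps(800_000, reads)
print(f"{reads} reads/query -> roofline {low:.0f}-{high:.0f} QPS")

# Best measured throughput from the scores above (2.0_16threads_cached).
best_measured_qps = 370.17
print(f"measured is {best_measured_qps / high:.0%} to "
      f"{best_measured_qps / low:.0%} of the roofline")
```

The first line printed is `500 reads/query -> roofline 1000-1600 QPS`, matching the estimate in the comment; the second shows how far the measured numbers sit below it.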

@codecov

codecov bot commented Jan 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@westonpace westonpace force-pushed the perf/vector-throughput-benchmark branch from 49841ab to 109b3f2 Compare January 16, 2026 14:39
@westonpace westonpace force-pushed the perf/vector-throughput-benchmark branch from 07721e1 to 856b255 Compare January 23, 2026 03:08
@westonpace westonpace merged commit 2efb090 into lance-format:main Jan 23, 2026
26 of 27 checks passed