db_bench: allow restricting the range of keys for a read benchmark to the range of database keys #101

Closed
isaac-io opened this issue Aug 2, 2022 · 7 comments
Assignees
Labels
enhancement New feature or request performance
Milestone

Comments

@isaac-io
Contributor

isaac-io commented Aug 2, 2022

Currently, db_bench doesn't allow controlling the range of keys that are read during a read workload. For the new paired bloom filter (#29), this causes the workload to bypass the filter completely whenever the generated keys fall outside the range of the data in the database.

Add an option to restrict key generation during a read workload to the range of keys in the database, so that the filter paths are hit and we can measure the impact of the changes in a real-world scenario.
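
To make the problem concrete, here is a minimal Python sketch (not db_bench code) of a random-read workload whose key range may be larger than the range of keys stored in the database. The function name, the uniform key generator, and the sizes are assumptions chosen for illustration only.

```python
import random

def fraction_in_db_range(db_keys, read_range, reads, seed=0):
    """Share of random read keys that fall inside the stored range [0, db_keys).

    Illustrative model only: keys are drawn uniformly from [0, read_range).
    """
    rng = random.Random(seed)
    in_range = sum(1 for _ in range(reads) if rng.randrange(read_range) < db_keys)
    return in_range / reads

# Read range 100x larger than the stored range: most lookups target keys
# that cannot exist in the database.
print(fraction_in_db_range(db_keys=1_000, read_range=100_000, reads=50_000))

# Read range equal to the stored range: every lookup lands inside it.
print(fraction_in_db_range(db_keys=1_000, read_range=1_000, reads=50_000))
```

In the first case only about 1% of the lookups can possibly hit stored data, which is the situation this issue wants to avoid when measuring the filter.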

@udi-speedb
Contributor

@isaac-io & I have discussed and agreed on the following:
We will add a new configuration parameter to db_bench. That parameter will allow the user to set the range of random keys to be used in benchmarks such as fillrandom and readrandom.
By default, the new parameter will set the range equal to the number of keys, which is the current behaviour => no change of behaviour by default.

@isaac-io
Contributor Author

isaac-io commented Aug 22, 2022

Note that currently db_bench simply divides the key space between threads evenly, so care should be taken to divide the range between the threads, rather than the number of keys for the benchmark as is done today.

EDIT: I seem to have confused db_bench and db_stress. db_bench doesn't need to track expected values, so it doesn't divide the key space between the threads as db_stress does.

@udi-speedb
Contributor

Following a discussion with @isaac-io, it seems db_bench already has 2 existing parameters that users may use to achieve the same purpose: 'reads' / 'writes'. These parameters, when specified, control the number of keys (when not specified, the number of keys is set by the 'num' parameter).
So, a user may specify both 'num' and 'reads' / 'writes'. In that case, 'num' controls the range of keys and 'reads' / 'writes' control how many keys are read / written.
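
As a rough illustration of that split, the Python sketch below models the fallback described above and builds a db_bench argument list for a readrandom run. It is a sketch under assumptions, not db_bench source: the helper names and the database path are hypothetical, and a negative value is used here to mean "reads not specified".

```python
def effective_reads(num, reads=-1):
    """Number of read operations: 'reads' when specified, otherwise 'num'.

    A negative 'reads' stands for "not specified" in this model.
    """
    return num if reads < 0 else reads

def readrandom_args(db_path, num, reads=-1):
    """Build a db_bench argument list for readrandom on an existing database.

    'num' sets the key range; 'reads' (if given) sets how many keys are read.
    """
    args = [
        "db_bench",
        "--benchmarks=readrandom",
        "--use_existing_db=1",
        f"--db={db_path}",
        f"--num={num}",
    ]
    if reads >= 0:
        args.append(f"--reads={reads}")
    return args

# Hypothetical example: keys drawn from a range of 1M, with 10M reads issued.
print(" ".join(readrandom_args("/tmp/bench_db", num=1_000_000, reads=10_000_000)))
print(effective_reads(num=1_000_000))                    # falls back to num
print(effective_reads(num=1_000_000, reads=10_000_000))  # uses reads
```

Keeping 'num' equal to the value used when the database was populated keeps the generated keys inside the database's key range, while 'reads' independently sets the length of the run.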

@isaac-io
Contributor Author

Can we close this issue then? Should we run the paired bloom filter benchmark with these settings in order to ensure that it works before we close?

@udi-speedb
Contributor

@erez-speedb - Could you please try these parameters and check whether they indeed give us what we want?

@udi-speedb udi-speedb assigned erez-speedb and unassigned udi-speedb Sep 5, 2022
@isaac-io isaac-io added this to the v2.1.0 milestone Sep 21, 2022
@erez-speedb

With num=$(($rows * 10000))
readrandom : 2.350 micros/op 1702456 ops/sec; 0.0 MB/s (1259 of 19126999 found)
With reads=$(($rows * 10000))
readrandom : 8.687 micros/op 460468 ops/sec; 75.5 MB/s (3279451 of 5186999 found)
@udi-speedb using the "-reads" flag is good enough and the test was updated accordingly.
Please consider reverting the db_bench change.
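
For reference, the found ratios in the two runs above can be computed directly from the reported counts. The short Python sketch below does only that arithmetic; the comparison with 1 - 1/e is an assumption-laden aside (it would apply if the database had been populated with as many uniformly random key draws as there are keys in the range), not something stated in this thread.

```python
import math

def found_ratio(found, total):
    """Fraction of lookups that found a key, from a readrandom summary line."""
    return found / total

ratio_num = found_ratio(1259, 19126999)       # run with num=$(($rows * 10000))
ratio_reads = found_ratio(3279451, 5186999)   # run with reads=$(($rows * 10000))

print(f"num run:   {ratio_num:.4%}")
print(f"reads run: {ratio_reads:.2%}")

# Expected share of distinct keys after n uniform random draws from n keys.
# Only relevant under the assumption described in the text above.
print(f"1 - 1/e:   {1 - math.exp(-1):.2%}")
```

With the 'reads' setting, well over half of the lookups find a key, whereas with the 'num' setting almost none do.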

@isaac-io
Contributor Author

Verified as working with the existing parameters.


3 participants