
WIP: Benchmarking #83

Open · andrewosh wants to merge 17 commits into master
Conversation

@andrewosh (Collaborator) commented Mar 21, 2018

Hey all,

Here's an initial stab at a benchmarking system that should help us get some solid numbers. Each benchmark runs against 4 databases, with a customizable number of trials per benchmark (the default is 5). The initial set of databases is (and perhaps we want to add to this? see the setup sketch after the list):

  1. hyperdb on disk
  2. hyperdb in memory
  3. leveldb on disk
  4. leveldb in memory
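
For concreteness, here's a minimal sketch of how those four targets could be constructed (my assumption about the setup; the actual helpers in bench/ may differ):

```js
// Sketch of the four benchmark targets. hyperdb accepts a path or a
// random-access-storage factory; leveldb is wrapped via levelup.
const hyperdb = require('hyperdb')
const ram = require('random-access-memory')
const levelup = require('levelup')
const leveldown = require('leveldown')
const memdown = require('memdown')

const targets = {
  'hyperdb (disk)': () => hyperdb('./bench-data/hyperdb'),
  'hyperdb (memory)': () => hyperdb(ram),
  'leveldb (disk)': () => levelup(leveldown('./bench-data/leveldb')),
  'leveldb (memory)': () => levelup(memdown())
}
```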

The initial set of benchmarks is very simple: large batch writes, many single writes, and iteration over various subsets of a large db. The database has a single writer and is entirely local, so this set will surely need to be expanded to reflect real-world use cases.
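
To give a feel for the shape of these, here's roughly what one write benchmark looks like in stock nanobench (a sketch only; the actual code in bench/ uses my modified fork, mentioned below):

```js
const bench = require('nanobench')
const hyperdb = require('hyperdb')
const ram = require('random-access-memory')

bench('batch write 100k random key/value pairs (hyperdb, memory)', function (b) {
  const db = hyperdb(ram, { valueEncoding: 'utf-8' })

  // Build the op log up front so only the batch itself is timed.
  const ops = []
  for (let i = 0; i < 100000; i++) {
    ops.push({ type: 'put', key: 'bench/' + i, value: Math.random().toString(16).slice(2) })
  }

  db.ready(function () {
    b.start()
    db.batch(ops, function (err) {
      if (err) throw err
      b.end()
    })
  })
})
```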

Speaking of real-world use cases, all the data so far is randomly generated. @mafintosh suggested a dictionary as a more realistic dataset. Any other ideas for fixtures?
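
For the sake of discussion, the two fixture flavors might look like this (hypothetical helpers, not code from this PR; the dictionary path is an assumption):

```js
const crypto = require('crypto')
const fs = require('fs')

// Random keys, as the current benchmarks use.
function randomKeys (n) {
  const keys = []
  for (let i = 0; i < n; i++) keys.push(crypto.randomBytes(8).toString('hex'))
  return keys
}

// Dictionary-backed keys, along the lines of @mafintosh's suggestion.
// /usr/share/dict/words is present on most Unix systems.
function dictionaryKeys (n) {
  const words = fs.readFileSync('/usr/share/dict/words', 'utf-8').trim().split('\n')
  return words.slice(0, n)
}
```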

At the end of a benchmarking run, results are dumped into CSV files in bench/stats. Here are some examples of what those look like, from a recent run:
https://github.com/andrewosh/hyperdb/blob/benchmarking-2/bench/stats/writes-random-data.csv
https://github.com/andrewosh/hyperdb/blob/benchmarking-2/bench/stats/reads-random-data.csv
(Timings are in nanoseconds, so some post-processing is required to make them readable.)
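
If you want to eyeball the numbers, here's a quick post-processing sketch (it assumes the elapsed time is the last CSV column; adjust to the actual headers):

```js
const fs = require('fs')

const rows = fs.readFileSync('bench/stats/writes-random-data.csv', 'utf-8')
  .trim().split('\n').map(line => line.split(','))

const [header, ...data] = rows
console.log(header.join(','))
for (const row of data) {
  // Convert the final column from nanoseconds to milliseconds.
  const ns = Number(row[row.length - 1])
  console.log(row.slice(0, -1).concat((ns / 1e6).toFixed(3) + 'ms').join(','))
}
```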

A few things of note:

  1. I'm currently using a modified version of nanobench, because I started abusing it and I'm not sure whether my changes belong upstream. Before merging, that dependency (on my nanobench fork) will have to be changed.
  2. The generated prefixes in the current read tests (reflected in the benchmarks above) aren't yet split into path components -- oops. I'm unsure whether this affects performance, but it's worth noting.
  3. Currently the maximum number of keys for any benchmark is 100k, since beyond that I get consistent heap memory errors in the batch write (one possible workaround is sketched after this list).
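
On item 3, one workaround worth trying (an idea, not something this PR implements) is to split an oversized batch into fixed-size chunks, so the whole op array is never processed at once:

```js
// Apply `ops` to `db` in chunks of `size` rather than one giant batch.
// Note each chunk becomes its own atomic batch, so this trades batch
// atomicity for bounded memory.
function chunkedBatch (db, ops, size, cb) {
  let i = 0
  function next (err) {
    if (err) return cb(err)
    if (i >= ops.length) return cb(null)
    const chunk = ops.slice(i, i + size)
    i += size
    db.batch(chunk, next)
  }
  next(null)
}

// Usage: chunkedBatch(db, ops, 10000, function (err) { ... })
```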

Review comment on package.json:

@@ -20,14 +20,20 @@
    "varint": "^5.0.0"
  },
  "devDependencies": {
    "@andrewosh/nanobench": "^2.2.0",
mafintosh (Owner) commented:

there are two nanobench entries in the deps

@mafintosh (Owner) commented:

@andrewosh what's missing for landing this? Would be a cool addition.
