Stress test notes #139
Is a decent place to start. That found a couple issues:
Improved:
Takes about 10 seconds to submit 500 tasks on my laptop. No hangups when I have things set up right:
Submitted 5000 tasks without trouble on my laptop. Took 1m32.777s.
1000 tasks on buchanan01 in exastack: 2m9.332s. Not sure what the difference is here. File system access maybe? More CPUs mean more concurrent workers writing to the DB? There are a lot of possible factors. Theoretically, since boltdb allows only one read-write transaction at a time across the entire database, the more clients writing to it, the lower the performance. A database with row-level locking, a write-ahead log, compare-and-set, etc. would likely have much higher performance.
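To make the timings above easier to compare, here is a small Go sketch that converts them into submission rates. The helper name `tasksPerSecond` is made up for illustration, and the laptop's "about 10 seconds" figure is treated as exactly 10s, so that rate is only approximate.

```go
package main

import (
	"fmt"
	"time"
)

// tasksPerSecond is a hypothetical helper (not part of funnel). It converts a
// task count and a wall-clock duration in Go duration syntax, such as
// "1m32.777s", into a submission rate.
func tasksPerSecond(tasks int, elapsed string) (float64, error) {
	d, err := time.ParseDuration(elapsed)
	if err != nil {
		return 0, err
	}
	if d <= 0 {
		return 0, fmt.Errorf("elapsed time must be positive, got %s", elapsed)
	}
	return float64(tasks) / d.Seconds(), nil
}

func main() {
	// Timings copied from the notes above; "10s" is the approximate laptop run.
	runs := []struct {
		host    string
		tasks   int
		elapsed string
	}{
		{"laptop", 500, "10s"},
		{"laptop", 5000, "1m32.777s"},
		{"buchanan01", 1000, "2m9.332s"},
	}
	for _, r := range runs {
		rate, err := tasksPerSecond(r.tasks, r.elapsed)
		if err != nil {
			fmt.Println("bad timing:", err)
			continue
		}
		fmt.Printf("%-11s %5d tasks in %-10s = %.1f tasks/s\n", r.host, r.tasks, r.elapsed, rate)
	}
}
```

By this measure the laptop sustains roughly 50 tasks/s at both sizes, while buchanan01 manages closer to 8 tasks/s, which is the gap the questions above are trying to explain.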
#55 seems important, since I've repeatedly forgotten to use |
Another test command uses `ping -w 30` to ping for 30 seconds, which puts some traffic on the task log streaming.
Testing with: 100 tasks takes 32 seconds
At some point, the terminal dashboard starts lagging and becomes unusable. Probably because A) it's listing everything, and B) the server is busy communicating with workers (~20-30 tasks)
The web dashboard seems to hold up fine in terms of interactivity. Of course, it's not nice to sort through 50 pages of completed tasks :)
Ideas for improvement:
Ideas for more stress:
I have some boltDB test code here: https://gist.github.com/buchanae/38cc3c0ccb0a092417a14e7abdb4a0f8, which helped me figure out that only a single update transaction can exist at a time, regardless of the bucket.
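For anyone who wants the single-writer behavior in miniature without pulling in boltdb, here is a toy Go model. It is not the gist's code and does not use the bolt API: `singleWriterDB`, `update`, and `runWriters` are all hypothetical names, and a plain `sync.Mutex` stands in for the database-wide write lock. Each goroutine writes to a different bucket, yet the peak number of concurrent writers stays at one.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// singleWriterDB is a toy model, NOT boltdb: one mutex guards every update,
// no matter which bucket the update touches.
type singleWriterDB struct {
	writeLock sync.Mutex
	buckets   map[string]int
	active    int // writers currently inside an update (guarded by writeLock)
	maxActive int // highest value of active observed (guarded by writeLock)
}

func newSingleWriterDB() *singleWriterDB {
	return &singleWriterDB{buckets: map[string]int{}}
}

// update runs fn while holding the database-wide write lock and records how
// many writers were inside an update at the same moment.
func (db *singleWriterDB) update(fn func(buckets map[string]int)) {
	db.writeLock.Lock()
	defer db.writeLock.Unlock()

	db.active++
	if db.active > db.maxActive {
		db.maxActive = db.active
	}
	fn(db.buckets)
	db.active--
}

// runWriters starts one goroutine per bucket name, waits for all of them, and
// returns the peak number of writers that were inside update together.
func runWriters(db *singleWriterDB, names []string) int {
	var wg sync.WaitGroup
	for _, name := range names {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			db.update(func(b map[string]int) {
				time.Sleep(time.Millisecond) // stand-in for disk I/O
				b[name]++
			})
		}(name)
	}
	wg.Wait()
	return db.maxActive
}

func main() {
	db := newSingleWriterDB()
	peak := runWriters(db, []string{"tasks", "logs", "workers", "queue"})
	fmt.Println("peak concurrent writers:", peak)
}
```

In this model the total write time grows linearly with the number of writers regardless of how many buckets they spread across, which matches the observation that adding workers does not add write throughput.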
Testing with: 1000 tasks: 1m 36s
Running 1000 iterations: ~15s
Amazingly, ~3.5 MB of data |
1 server, no workers on buchanan01:
`time for i in {1..1000}; do ./funnel run -S http://localhost:8000 --cpu 1 --cmd 'sleep 30'; done`
This command doesn't output any logs, reducing the write traffic to the database.
1000 tasks: 50s (~30-40s less than tasks with log traffic)
Another idea for improvement:
Trying out what happens when I set
I implemented a quick and dirty badger database backend, which looks 4x faster for creating 5000 tasks:
This task in JSON repeated 5000 times is ~ 1.8 MB
So we're switching to badger?
I think so. I don't see a substantial downside yet.
Hopefully doing the new db driver will let you identify the elements it has in common with the original boltdb one, so adding more db plugins will be easier.
Summary:
The comments below are for bolt DB:
Benchmark output: OpenStack, 12 CPUs
macbook
Google cloud VM, n1-standard-8 (8 vCPUs, 30 GB memory) with non-SSD
Google cloud VM, n1-standard-8 (8 vCPUs, 30 GB memory) with SSD
The tests below failed with
I'll keep some notes about stress test results here and then come up with a more concrete list of issues later.