-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[extension/filestorage] Change bbolt DB settings for improved performance #9004
[extension/filestorage] Change bbolt DB settings for improved performance #9004
Conversation
@swiatekm-sumo I think this could use a changelog entry @djaglowski could you take a look? |
94c0523
to
9b90e3b
Compare
Looks like one of the compaction tests fails due to disabling freelist sync. @sumo-drosiek you're the author, is there a reason the setup for that test (
|
This is wrote that way due to the following issue:
So I made it slightly complicated to ensure that the compaction is strictly related to deletion of entries Also I wanted to avoid using magic numbers |
Anyway, I'm fine with modification if the compaction doesn't really behaves as I thought |
I looked at the values, and it seems like bbolt allocates disk space in 32Kb pages and doesn't compact to values smaller than that. However, with freelist syncing off, it starts off at 20Kb and goes to 32Kb after the first insert. This is why the test breaks. I could fix it by inserting some values first to make it fully allocate the first page, but at that point I'd prefer to make it much simpler with a magic number. |
9b90e3b
to
cd7ef98
Compare
Just please add comments, why the magic numbers are being used and link this discussion |
I know there are some flaky testbed tests, but I don't recall this one failing any time recently, so it may be related to this PR. Does a minor increase in RAM make sense with this change? If so, and assuming it's acceptable, we should bump this limit as part of this PR. |
Overall this change makes sense to me and the changes look good. I agree w/ the following comment though:
|
It's possible that the freelist uses more RAM as a hashmap vs an array. I'll do a simple benchmark to verify. |
@swiatekm-sumo, I spoke too soon. I just observed the same failure on another PR. Looks like it's not due to this change. |
6799b33
to
86174ff
Compare
Yeah, probably not. The freelist size increases by a little over 2% in the benchmark. I'd actually committed the changes with the line disabling freelist sync commented out by accident - with that fixed, E2E tests didn't show any memory problems - the particular test you're referencing used ~70 MB, which makes sense because we're now skipping allocations for serializing the freelist to disk. In any case, I've rebased my changes and added a comment about the magic numbers in the compaction test. Let me know if you need me to do anything else. |
bbolt doesn't automatically reclaim unused disk space - it expects the user to manually compact it. This has performance implications in that it keeps track of disk segments in a freelist structure, which can be large even if the db doesn't actually contain any data. By default, it also syncs this freelist to disk on every transaction. Disable freelist syncing and change freelist type to one which performs better on large dbs. The result is a massive reduction in the cost of writes to a large DB file. Include benchmarks to demonstrate this. Not syncing the freelist makes opening the DB more expensive, but this only happens on extension start, and the runtime remains under 1s even on 10 Gb db files.
86174ff
to
05040ac
Compare
Description:
Disable freelist syncing and switch to a freelist type that's more performant on larger DBs. See issue for a more detailed description.
Link to tracking Issue: Closes #9003
Testing:
Added two additional benchmarks for larger DBs. I've also manually verified that the changes are backwards-compatible.