[extension/filestorage] Change bbolt DB settings for improved performance #9004

Conversation

swiatekm
Contributor

@swiatekm swiatekm commented Apr 1, 2022

Description:
Disable freelist syncing and switch to a freelist type that's more performant on larger DBs. See issue for a more detailed description.

Link to tracking Issue: Closes #9003

Testing:
Added two additional benchmarks for larger DBs. I've also manually verified that the changes are backwards-compatible.

@swiatekm swiatekm requested a review from a team April 1, 2022 10:00
@swiatekm swiatekm requested a review from djaglowski as a code owner April 1, 2022 10:00
@linux-foundation-easycla

linux-foundation-easycla bot commented Apr 1, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: swiatekm-sumo / name: Mikołaj Świątek (1eae153969153c407046df2b1681d6cc088a1477)

@pmm-sumo
Contributor

pmm-sumo commented Apr 1, 2022

@swiatekm-sumo I think this could use a changelog entry

@djaglowski could you take a look?

@swiatekm swiatekm force-pushed the fix/filestorageextension/bbolt-settings branch 2 times, most recently from 94c0523 to 9b90e3b Compare April 1, 2022 10:52
@swiatekm
Contributor Author

swiatekm commented Apr 1, 2022

Looks like one of the compaction tests fails due to disabling freelist sync. @sumo-drosiek you're the author, is there a reason the setup for that test (the block commented `// add data until database file changes size (we are checking compacted size)`) is so complicated? If I just put 1000 records in the DB, compact, and compare sizes, it passes.

@sumo-drosiek
Member

It was written that way due to the following issue:

  1. I was pushing data to storage
  2. Storage size changed (up)
  3. I compacted without removing anything (storage size changed (down))

So I made it slightly more complicated to ensure that the compaction is strictly related to the deletion of entries.

I also wanted to avoid using magic numbers.

@sumo-drosiek
Member

Anyway, I'm fine with the modification if the compaction doesn't really behave as I thought.

@swiatekm
Contributor Author

swiatekm commented Apr 1, 2022

Anyway, I'm fine with the modification if the compaction doesn't really behave as I thought.

I looked at the values, and it seems that bbolt allocates disk space in 32 KB pages and doesn't compact to sizes smaller than that. However, with freelist syncing off, the file starts at 20 KB and grows to 32 KB after the first insert, which is why the test breaks. I could fix it by inserting some values first so that the first page is fully allocated, but at that point I'd prefer to make the test much simpler with a magic number.

@swiatekm swiatekm force-pushed the fix/filestorageextension/bbolt-settings branch from 9b90e3b to cd7ef98 Compare April 1, 2022 15:13
@sumo-drosiek
Member

Please just add comments explaining why the magic numbers are used, and link this discussion.

@djaglowski
Member

Log10kDPS/filelog                       |FAIL  |      6s|    13.0|    13.7|         49|        102|     61600|         58700|RAM consumption is 102 MiB, max expected is 100 MiB

I know there are some flaky testbed tests, but I don't recall this one failing any time recently, so it may be related to this PR. Does a minor increase in RAM make sense with this change? If so, and assuming it's acceptable, we should bump this limit as part of this PR.

@djaglowski
Member

Overall this change makes sense to me and the changes look good. I agree w/ the following comment though:

Please just add comments explaining why the magic numbers are used, and link this discussion.

@swiatekm
Contributor Author

swiatekm commented Apr 4, 2022

Log10kDPS/filelog                       |FAIL  |      6s|    13.0|    13.7|         49|        102|     61600|         58700|RAM consumption is 102 MiB, max expected is 100 MiB

I know there are some flaky testbed tests, but I don't recall this one failing any time recently, so it may be related to this PR. Does a minor increase in RAM make sense with this change? If so, and assuming it's acceptable, we should bump this limit as part of this PR.

It's possible that the freelist uses more RAM as a hashmap vs an array. I'll do a simple benchmark to verify.

@djaglowski
Member

@swiatekm-sumo, I spoke too soon. I just observed the same failure on another PR. Looks like it's not due to this change.

@swiatekm swiatekm force-pushed the fix/filestorageextension/bbolt-settings branch from 6799b33 to 86174ff Compare April 4, 2022 16:09
@swiatekm
Contributor Author

swiatekm commented Apr 4, 2022

@swiatekm-sumo, I spoke too soon. I just observed the same failure on another PR. Looks like it's not due to this change.

Yeah, probably not. The freelist size increases by a little over 2% in the benchmark. I had accidentally committed the changes with the line disabling freelist sync commented out. With that fixed, the E2E tests didn't show any memory problems: the particular test you're referencing used ~70 MB, which makes sense because we're now skipping the allocations for serializing the freelist to disk.

In any case, I've rebased my changes and added a comment about the magic numbers in the compaction test. Let me know if you need me to do anything else.

bbolt doesn't automatically reclaim unused disk space - it expects the
user to manually compact it. This has performance implications in that
it keeps track of disk segments in a freelist structure, which can be
large even if the db doesn't actually contain any data. By default, it
also syncs this freelist to disk on every transaction.

Disable freelist syncing and change freelist type to one which performs
better on large dbs. The result is a massive reduction in the cost of
writes to a large DB file. Include benchmarks to demonstrate this.

Not syncing the freelist makes opening the DB more expensive, but this
only happens on extension start, and the runtime remains under 1s even
on 10 GB db files.
@swiatekm swiatekm force-pushed the fix/filestorageextension/bbolt-settings branch from 86174ff to 05040ac Compare April 5, 2022 08:17
@pmm-sumo pmm-sumo added the ready to merge Code review completed; ready to merge by maintainers label Apr 5, 2022
@codeboten codeboten merged commit 637d773 into open-telemetry:main Apr 5, 2022
@swiatekm swiatekm deleted the fix/filestorageextension/bbolt-settings branch April 6, 2022 08:12
Successfully merging this pull request may close these issues.

[extensions/file_storage] Change bbolt settings for better performance on large DB files