-
Notifications
You must be signed in to change notification settings - Fork 439
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compaction causes full reimaging #2948
Comments
And what do you think about my "partial image layers" approach (#2563) ? |
Do you have a text explanation that goes with the presentation? I get the concept of "delta with page images", but I'm missing a lot of specifics on changes to heuristics. Do we still take regular images after x deltas, per partition? |
…separate rel/non-rel and sys/user relation entries refer #2948
It seems to me that I have answered this question somewhere (may be in Slack?).
|
…aory image layer generation refer #2948
…aory image layer generation refer #2948
## Describe your changes This is yet another attempt to address problem with storage size ballooning #2948 Previous PR #3348 tries to address this problem by maintaining list of holes for each layer. The problem with this approach is that we have to load all layer on pageserver start. Lazy loading of layers is not possible any more. This PR tries to collect information of N largest holes on compaction time and exclude this holes from produced layers. It can cause generation of larger number of layers (up to 2 times) and producing small layers. But it requires minimal changes in code and doesn't affect storage format. For graphical explanation please see thread: #3597 (comment) ## Issue ticket number and link #2948 #3348 ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.
Steps to reproduce
I haven't locally reproduced it, but I saw it happen in prod 9 times on the same project, odd-brook-545806. I used the timeline visualizer to spot the problem, highlighted in red
Expected result
Compaction makes L1 layers, which trigger image creation only where needed.
Actual result
Three little L1 layers cause the entire database to get covered in new image layers. As a result, we have 9 full unnecessary copies of the database.
Also this bug complicates perf testing, since reused projects will behave differently than fresh projects.
Environment
prod
Why is this happening?
We take an image when there are at least 3 L1 layers over the latest image. But every compaction creates 2 very long L1 layers that span all the way to the left and right of the keyspace, because there are some nonrel keys in those parts of the keyspace that get updated often.
Proposal for solution
Instead of creating images, include these long and sparse delta layers in the next compaction round. Basically treat them as L0 layers. Redefine the definition of L1 to be "a sufficiently dense layer, regardless of how many times it's been compacted".
The text was updated successfully, but these errors were encountered: