Post initial tenant_size issues #2748
Comments
Tenant size information is gathered by using existing parts of `Tenant::gc_iteration`, which are now separated out as `Tenant::refresh_gc_info`. `Tenant::refresh_gc_info` collects branch points and invokes `Timeline::update_gc_info`; nothing was supposed to change there. The gathered branch points (through Timeline's `GcInfo::retain_lsns`), `GcInfo::horizon_cutoff`, and `GcInfo::pitr_cutoff` are used to build up a Vec of updates fed into `libs/tenant_size_model` to calculate the history size.

The gathered information is now exposed using `GET /v1/tenant/{tenant_id}/size`, which will respond with the actual calculated size. Initially the idea was to have this delivered as a tenant background task and exported via a metric, but it might be too computationally expensive to run periodically, as we don't yet know if the returned values are any good.

Adds one new metric:
- `pageserver_storage_operations_seconds` with label `logical_size`
  - separated from the original `init_logical_size`

Adds a pageserver-wide configuration variable:
- `concurrent_tenant_size_logical_size_queries` with default 1

This leaves a lot of TODOs, tracked in issue #2748.
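For readers unfamiliar with this kind of limit, here is a minimal sketch of how a setting like `concurrent_tenant_size_logical_size_queries` can cap the number of size calculations running at once. It is written in Python rather than the pageserver's Rust and is not the pageserver's implementation: `SizeQueryLimiter`, `fake_logical_size`, and the timeline names are hypothetical, and only the default of 1 comes from the description above.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Assumption: mirrors the documented default of 1; the name is illustrative only.
CONCURRENT_TENANT_SIZE_LOGICAL_SIZE_QUERIES = 1


class SizeQueryLimiter:
    """Hypothetical helper that caps concurrent logical size calculations."""

    def __init__(self, limit):
        self._permits = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self._running = 0
        self.max_running = 0  # high-water mark, kept only for demonstration

    def run(self, calculate, *args):
        # Block until a permit is free, then run the calculation.
        with self._permits:
            with self._lock:
                self._running += 1
                self.max_running = max(self.max_running, self._running)
            try:
                return calculate(*args)
            finally:
                with self._lock:
                    self._running -= 1


def fake_logical_size(timeline_id):
    # Placeholder for an expensive calculation; not the real pageserver logic.
    time.sleep(0.01)
    return len(timeline_id)


limiter = SizeQueryLimiter(CONCURRENT_TENANT_SIZE_LOGICAL_SIZE_QUERIES)
timelines = ["a", "bb", "ccc"]  # made-up timeline names

# Four worker threads compete, but the limiter lets only one calculate at a time.
with ThreadPoolExecutor(max_workers=4) as pool:
    sizes = list(pool.map(lambda t: limiter.run(fake_logical_size, t), timelines))
```

With the limit at 1, requests are serialized even when several arrive together, trading latency for a bounded load while the cost of the calculation is still unknown.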
Another related point is tied to on-demand download. In the current on-demand model we need to download a significant number of layers to calculate the size. It would be good if we could make the incremental calculation reliable across restarts; if not, we'll probably need some tweaking to place sizes in one layer (or metadata) so they are cheaper to obtain on startup.
On the #2755 my selection of |
#3377 fixes bugs, adds test cases (some with skipped-test annotations), and changes how the initial tenant size is calculated to how it should have been all along.
Initial PR: #2714. This issue condenses the review comments, remaining TODOs, and observations.
Incremental size update ideas for both:
- `libs/tenant_size_model`
- `pageserver::tenant::size::calculate_logical_size`
test_get_tenant_size_with_multiple_branches is flaky: tenant size mismatch #2962
Make Postgres 15 default #2809 -- test failure blocking
#2817 will change a lot, but while that rewrite has been in progress, many of the issues are now handled.
Done or irrelevant now given all of the post-initial changes:
- `libs/tenant_size_model` (More tenant size fixes #3410)
- `gc_horizon` (currently just zero)
- `next_gc_cutoff` being on the ancestor_timeline (currently filtered out)
- `retention_period` parameter