pageserver: refactor TenantId to TenantShardId in Tenant & Timeline #5957
2400 tests run: 2303 passed, 0 failed, 97 skipped (full report)
Flaky tests (5)
Postgres 14
Code coverage (full report)
The comment gets automatically updated with the latest test results: 02b8603 at 2023-11-29T15:03:48.860Z
31f613a to ef28282
ef28282 to 80e6666
Should the `debug_assert_current_span_has_tenant_id` and `debug_assert_current_span_has_tenant_and_timeline_id` now require `shard_id` as well?
Yes indeed, but later: I intend to do that once all the TenantId locations are updated to be shard-aware (this PR is just the primary Tenant/Timeline path).
80e6666 to e6ec401
…5960)

Precursor for #5957

## Problem

When DeletionList was written, TenantId/TimelineId didn't have human-friendly modes in their serde. #5335 added those, such that the helpers used in serialization of HashMaps are no longer necessary.

## Summary of changes

- Add a unit test to ensure that this change isn't changing anything about the serialized form
- Remove the serialization helpers for maps of Id
How did you ensure that in all the places we use the Serialize, Display, and Debug impls for TenantShardId, it's actually supposed to print the full TenantShardId and not just the TenantId?
It wouldn't be noticeable right now because we only have `TenantShardId::unsharded`. But it could bite us at some later point.
For example, how did you ensure you caught all the `#[instrument()]` cases? The `debug_assert_span_contains...` is useful for that.
Lastly, I wonder whether our tracing spans should include the monotonic version number of the sharding config (I assume such a thing exists somewhere outside of PS).
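To illustrate why this is hard to notice today, here is a minimal sketch. It assumes, purely for illustration, that an unsharded TenantShardId formats exactly like its TenantId and that a sharded one appends a shard suffix. The struct fields, the suffix format, and the `shard_count == 0` convention are hypothetical stand-ins, not the pageserver's actual definitions.

```rust
use std::fmt;

// Hypothetical stand-in for the real TenantId type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TenantId(u128);

// Hypothetical stand-in for the real TenantShardId type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TenantShardId {
    tenant_id: TenantId,
    shard_number: u8,
    shard_count: u8, // assumption: 0 means "unsharded" in this sketch
}

impl TenantShardId {
    fn unsharded(tenant_id: TenantId) -> Self {
        Self { tenant_id, shard_number: 0, shard_count: 0 }
    }

    fn is_unsharded(&self) -> bool {
        self.shard_count == 0
    }
}

impl fmt::Display for TenantId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // 32 zero-padded hex digits.
        write!(f, "{:032x}", self.0)
    }
}

impl fmt::Display for TenantShardId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if self.is_unsharded() {
            // Assumed behaviour: indistinguishable from a bare TenantId.
            write!(f, "{}", self.tenant_id)
        } else {
            // Assumed suffix format: "-<shard_number><shard_count>" in hex.
            write!(f, "{}-{:02x}{:02x}", self.tenant_id, self.shard_number, self.shard_count)
        }
    }
}
```

Under these assumptions, a test that only ever uses `TenantShardId::unsharded` cannot tell a span that records the full id from one that records only the tenant id. A test using a sharded id would expose the difference.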
e6ec401 to 02b8603
## Problem

In #5957, the most essential types were updated to use TenantShardId rather than TenantId. That unblocked other work, but didn't fully enable running multiple shards from the same tenant on the same pageserver.

## Summary of changes

- Use TenantShardId in page cache key for materialized pages
- Update mgr.rs get_tenant() and list_tenants() functions to use a shard id, and update all callers.
- Eliminate the exactly_one_or_none helper in mgr.rs and all code that used it
- Convert timeline HTTP routes to use tenant_shard_id

Note on page cache:

```
struct MaterializedPageHashKey {
    /// Why is this TenantShardId rather than TenantId?
    ///
    /// Usually, the materialized value of a page@lsn is identical on any shard in the same tenant. However, this
    /// is not the case for certain internally-generated pages (e.g. relation sizes). In future, we may make this
    /// key smaller by omitting the shard, if we ensure that reads to such pages always skip the cache, or are
    /// special-cased in some other way.
    tenant_shard_id: TenantShardId,
    timeline_id: TimelineId,
    key: Key,
}
```
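As a rough sketch of what the shard-aware key buys, the following models the cache as a plain `HashMap`. The `PageCacheSketch` type and the simplified id types are hypothetical and only mirror the field names from the note above. The real page cache is not a `HashMap` of byte vectors.

```rust
use std::collections::HashMap;

// Simplified, hypothetical stand-ins for the real id types.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct TenantShardId {
    tenant_id: u128,
    shard_number: u8,
    shard_count: u8,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct TimelineId(u128);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Key(u64);

// Same field names as the key described in the note on the page cache.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct MaterializedPageHashKey {
    tenant_shard_id: TenantShardId,
    timeline_id: TimelineId,
    key: Key,
}

// Hypothetical toy cache: maps a key to the materialized page bytes.
#[derive(Default)]
struct PageCacheSketch {
    pages: HashMap<MaterializedPageHashKey, Vec<u8>>,
}

impl PageCacheSketch {
    fn insert(&mut self, key: MaterializedPageHashKey, page: Vec<u8>) {
        self.pages.insert(key, page);
    }

    fn get(&self, key: &MaterializedPageHashKey) -> Option<&Vec<u8>> {
        self.pages.get(key)
    }

    fn len(&self) -> usize {
        self.pages.len()
    }
}
```

Because the shard is part of the hash key, two shards of the same tenant that materialize different contents for the same timeline and key (for example a relation size page) occupy separate entries and do not overwrite each other.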
(includes two preparatory commits from #5960)
Problem
To accommodate multiple shards in the same tenant on the same pageserver, we must include the full TenantShardId in local paths. That means that all code touching local storage needs to see the TenantShardId.
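The requirement above can be sketched as follows. The `tenants/<id>` directory layout, the `dir_name` helper, the suffix format, and the simplified `TenantShardId` fields are assumptions made for illustration, not the pageserver's actual on-disk layout.

```rust
use std::path::{Path, PathBuf};

// Simplified, hypothetical stand-in for the real TenantShardId.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TenantShardId {
    tenant_id: u128,
    shard_number: u8,
    shard_count: u8, // assumption: 0 means "unsharded" in this sketch
}

impl TenantShardId {
    // Hypothetical helper: the directory name for this tenant shard.
    // The shard suffix keeps two shards of one tenant from sharing a directory.
    fn dir_name(&self) -> String {
        if self.shard_count == 0 {
            format!("{:032x}", self.tenant_id)
        } else {
            format!("{:032x}-{:02x}{:02x}", self.tenant_id, self.shard_number, self.shard_count)
        }
    }
}

// Builds the local directory for a tenant shard under an assumed
// "<workdir>/tenants/<dir_name>" layout.
fn tenant_path(workdir: &Path, tenant_shard_id: &TenantShardId) -> PathBuf {
    workdir.join("tenants").join(tenant_shard_id.dir_name())
}
```

If the path were derived from the tenant id alone, two shards of the same tenant placed on one pageserver would resolve to the same directory. That is why code touching local storage needs the full TenantShardId.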
Summary of changes
Replace `tenant_id: TenantId` with `tenant_shard_id: TenantShardId` on Tenant, Timeline and RemoteTimelineClient.
This doesn't update absolutely everything: things like PageCache, TaskMgr, WalRedo are still shard-naive. The purpose of this PR is to update the core types so that other code can be added/updated incrementally without churning the most central shared types.