Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate safekeeper unique IDs. #1228

Closed
wants to merge 1 commit into from
Closed

Generate safekeeper unique IDs. #1228

wants to merge 1 commit into from

Conversation

arssher
Copy link
Contributor

@arssher arssher commented Feb 6, 2022

In preparation for storage node messaging.

Copy link
Contributor

@LizardWizzard LizardWizzard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok to me. Would be good to further unify NodeIds for safekeeper and pageserver. I can improve on that when I'll be working on pageserver side

walkeeper/src/bin/safekeeper.rs Outdated Show resolved Hide resolved
zenith_utils/src/zid.rs Outdated Show resolved Hide resolved
@arssher
Copy link
Contributor Author

arssher commented Feb 14, 2022

Updated patch makes ids monotonic with decimal representation. @hlinnaka note that this id completely replaces textual 'name' currently known by zenith CLI/fixtures. We can add it back as some optional human readable name, but I doubt it makes much sense now.

Copy link
Contributor

@LizardWizzard LizardWizzard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still do not get it why we need different flavours of ids. Reposting here for visibility.

If we need to distinguish between entities that ids belong to, lets do that explicitly. We can reserve some of these id bits to some tag value and use it in Debug/Display impls. So id will look something like that: node/abcd12354 or tenant/1234abcd. IMO it is better than some heuristics. Also if our ids are too long, lets shrink them to minimal possible length. e g to have 8 bytes instead of 16.

@arssher
Copy link
Contributor Author

arssher commented Feb 15, 2022

As said in another PR, I'm for that variant mostly due to shortness/readability/being easy to remember. Right now tenant id + timeline id are taking about half of space of log line; imagine adding to that global installation id + shard id once we get to sharding + storage node id, perhaps several (e.g. pageserver and safekeepers). Most of our logs would be ids %).

If you want another example of such "mixed" ids, consider CockroachDB: they have uuid for global cluster id and sequential decimal ids for node ids. For the same reasons, I would expect.

Random 64 bit ids combine both disadvantages: they are not long enough to regard them unique without additional checks (unlike uuid) and still all values are quite long.

Still, I don't care very much here, I just want to proceed. I wouldn't extremely mind the alternative, and it is not very hard to change later.

Copy link
Contributor

@LizardWizzard LizardWizzard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If one uuid has enough uniqueness can we shrink tenant/timeline ids in half? So together they form this cluster uuid.

I'm not strongly opposed to sequential ids. My concern is having too many concepts to deal with. If one type of id is enough, then why bother and invent new ones? We can have one id with some tag in the first byte which describes what kind of id it is without guessing

Regarding logs I think it is inevitable that more and more variables are being added to message context. Modern logging follows machine readable path, e.g json. With that we can filter out the noise.

Feel free to commit this if you're in a hurry.

walkeeper/src/bin/safekeeper.rs Outdated Show resolved Hide resolved
In preparation for storage node messaging. IDs are supposed to be monotonically
assigned by the console. In tests it is issued by ZenithEnv; at the zenith cli
level and fixtures, string name is completely replaced by integer id. Example
TOML configs are adjusted accordingly.

Sequential ids are chosen over Zid mainly because they are compact and easy to
type/remember.
@arssher
Copy link
Contributor Author

arssher commented Mar 4, 2022

Merged through #1310

@arssher arssher closed this Mar 4, 2022
@bayandin bayandin deleted the sk_id branch May 19, 2023 13:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants