Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Fix Metadata table's UUID #9310

Merged
merged 2 commits into from
Dec 16, 2023
Merged

Conversation

ajantha-bhat
Copy link
Member

BaseFileRewriteCoordinator, ScanTaskSetManager calls table.uuid() of metadata table multiple times for setting and reading from cache. So, uuid has to be consistent on each call for a given table.

This issue was discovered during #9298

Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this @ajantha-bhat just a nit

.as("UUID should be consistent on multiple calls")
.isEqualTo(manifestsTable.uuid());
Assertions.assertThat(manifestsTable.uuid())
.as("Metadata table UUID should be different from main table UUID")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "should be different from the base table's UUID"

@@ -202,7 +204,7 @@ public Map<String, SnapshotRef> refs() {

@Override
public UUID uuid() {
return UUID.randomUUID();
Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar Dec 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, I can't believe I wrote it like this originally :). Good fix, yes we do want a random UUID but it should be consistent across the same Metadata Table.

The only thing is if we extend this logic further is that the same base metadata table across different references won't return the same UUID. So for example "all manifests" for t1 may have a different UUID than "all manifests" for t1 in a different query.

I don't really see a way around that without persisting the UUID somewhere (like for normal tables that's in the metadata) but considering metadata tables are really more logical tables, I think that's OK. If we wanted that strict definition for metadata tables, then we would have to throw an unsupported until we persisted those details somewhere.

For now though, I think this change is good as is. It looks like in the next change you have the UUID is used as part of caching, in which case this fix should solve that.

@amogh-jahagirdar
Copy link
Contributor

@ajantha-bhat the same unrelated Flink flaky test is occurring that #9309 is fixing. I'm going to close and re-open this PR to retrigger CI

@amogh-jahagirdar amogh-jahagirdar merged commit 24578a2 into apache:main Dec 16, 2023
62 of 82 checks passed
lisirrx pushed a commit to lisirrx/iceberg that referenced this pull request Jan 4, 2024
geruh pushed a commit to geruh/iceberg that referenced this pull request Jan 26, 2024
devangjhabakh pushed a commit to cdouglas/iceberg that referenced this pull request Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants