Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Fix Metadata table's UUID #9310

Merged
merged 2 commits into from
Dec 16, 2023
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion core/src/main/java/org/apache/iceberg/BaseMetadataTable.java
Original file line number Diff line number Diff line change
Expand Up @@ -44,13 +44,15 @@ public abstract class BaseMetadataTable extends BaseReadOnlyTable
private final SortOrder sortOrder = SortOrder.unsorted();
private final BaseTable table;
private final String name;
private final UUID uuid;

protected BaseMetadataTable(Table table, String name) {
super("metadata");
Preconditions.checkArgument(
table instanceof BaseTable, "Cannot create metadata table for non-data table: %s", table);
this.table = (BaseTable) table;
this.name = name;
this.uuid = UUID.randomUUID();
}

/**
Expand Down Expand Up @@ -202,7 +204,7 @@ public Map<String, SnapshotRef> refs() {

@Override
public UUID uuid() {
return UUID.randomUUID();
Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar Dec 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, I can't believe I wrote it like this originally :). Good fix, yes we do want a random UUID but it should be consistent across the same Metadata Table.

The only thing is if we extend this logic further is that the same base metadata table across different references won't return the same UUID. So for example "all manifests" for t1 may have a different UUID than "all manifests" for t1 in a different query.

I don't really see a way around that without persisting the UUID somewhere (like for normal tables that's in the metadata) but considering metadata tables are really more logical tables, I think that's OK. If we wanted that strict definition for metadata tables, then we would have to throw an unsupported until we persisted those details somewhere.

For now though, I think this change is good as is. It looks like in the next change you have the UUID is used as part of caching, in which case this fix should solve that.

return uuid;
}

@Override
Expand Down
13 changes: 13 additions & 0 deletions core/src/test/java/org/apache/iceberg/TestMetadataTableScans.java
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,7 @@
import org.apache.iceberg.relocated.com.google.common.collect.Streams;
import org.apache.iceberg.types.Types;
import org.apache.iceberg.util.StructLikeWrapper;
import org.assertj.core.api.Assertions;
import org.junit.Assert;
import org.junit.Assume;
import org.junit.Test;
Expand Down Expand Up @@ -138,6 +139,18 @@ public void testManifestsTableAlwaysIgnoresResiduals() throws IOException {
}
}

@Test
public void testManifestsTableUUID() {
amogh-jahagirdar marked this conversation as resolved.
Show resolved Hide resolved
Table manifestsTable = new ManifestsTable(table);

Assertions.assertThat(manifestsTable.uuid())
.as("UUID should be consistent on multiple calls")
.isEqualTo(manifestsTable.uuid());
Assertions.assertThat(manifestsTable.uuid())
.as("Metadata table UUID should be different from main table UUID")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "should be different from the base table's UUID"

.isNotEqualTo(table.uuid());
}

@Test
public void testDataFilesTableWithDroppedPartition() throws IOException {
table.newFastAppend().appendFile(FILE_A).appendFile(FILE_B).commit();
Expand Down