-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Core,API: Set 503: added_snapshot_id
as required
#11626
base: main
Are you sure you want to change the base?
Conversation
c771cbe
to
bea92b6
Compare
`503: added_snapshot_id` field should be written as a required field, but currently it is written as optional. As the reference implementation should produce metadata that is as close to the spec as possible. For reading, this isn't a problem with the current Java implementation as it will still read optional fields as required, but only thrown an error when it encounters a `null` value.
bea92b6
to
a74e76e
Compare
try { | ||
this.length = toCopy.length(); | ||
} catch (UnsupportedOperationException e) { | ||
// Can be removed when embedded manifests are dropped | ||
// DummyFileIO does not support .length() | ||
this.length = null; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See
iceberg/core/src/main/java/org/apache/iceberg/SnapshotParser.java
Lines 179 to 218 in a52afdc
/** | |
* The main purpose of this class is to lazily retrieve the path from a v1 Snapshot that has | |
* manifest lists | |
*/ | |
private static class DummyFileIO implements FileIO { | |
@Override | |
public InputFile newInputFile(String path) { | |
return new InputFile() { | |
@Override | |
public long getLength() { | |
throw new UnsupportedOperationException(); | |
} | |
@Override | |
public SeekableInputStream newStream() { | |
throw new UnsupportedOperationException(); | |
} | |
@Override | |
public String location() { | |
return path; | |
} | |
@Override | |
public boolean exists() { | |
return true; | |
} | |
}; | |
} | |
@Override | |
public OutputFile newOutputFile(String path) { | |
throw new UnsupportedOperationException(); | |
} | |
@Override | |
public void deleteFile(String path) { | |
throw new UnsupportedOperationException(); | |
} | |
} |
GenericManifestFile.copyOf( | ||
new GenericManifestFile(localInput("file:/tmp/manifest1.avro"), 0)) | ||
.withSnapshotId(snapshotId) | ||
.build(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: it looks like this is done quite a bit. What about making it easier either with a test helper or a better builder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good. I'm trying to think if there was a reason why the spec requires snapshot ID but we didn't write it that way. I can't think of what might have caused this. Good catch!
503: added_snapshot_id
field should be written as a required field, but currently it is written as optional.As the reference implementation should produce metadata that is as close to the spec as possible.
For reading, this isn't a problem with the current Java implementation as it will still read optional fields as required, but only thrown an error when it encounters a
null
value.