Iceberg reading with explicit Schema support #6124
Comments
I can also make an argument that our default should be: if providing a specific snapshot (and not a schema), the results would be the snapshot's data projected into the latest Table's schema. Regardless, in either regime, the user can specify the schema, and so achieve the behavior they desire.
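That default (a snapshot's data projected into the latest Table's schema) leans on Iceberg's field ids: columns are matched by id, so a rename in the latest schema only re-labels existing data, and columns added after the snapshot come back as null. A minimal, hypothetical sketch of that matching (types simplified to `Object`; this is not Deephaven's actual implementation):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: project one snapshot row into the latest schema.
// Iceberg matches columns by field id, so a column renamed in the latest
// schema simply re-labels the same data, and a column added after the
// snapshot shows up as null.
final class Projection {
    static Map<String, Object> project(
            Map<Integer, Object> rowByFieldId,      // snapshot data keyed by field id
            Map<Integer, String> latestSchema) {    // latest schema: field id -> column name
        final Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<Integer, String> col : latestSchema.entrySet()) {
            // Ids absent from the snapshot (columns added later) yield null;
            // ids dropped from the latest schema are simply not emitted.
            out.put(col.getValue(), rowByFieldId.get(col.getKey()));
        }
        return out;
    }
}
```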
The same should also be applicable on the Iceberg writing side.
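The issue body below also notes that String column renames are tied to a specific Schema and can be converted internally into field-id renames. A minimal sketch of that conversion, with the Schema simplified to a name-to-field-id map (the real Iceberg `Schema` exposes `findField(name).fieldId()` for this lookup; the helper below is hypothetical):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: convert a name-keyed rename map into a field-id-keyed
// one, using the specific Schema the names were written against. Once keyed
// by field id, the renames remain valid even if the columns are renamed
// again in later Schemas.
final class Renames {
    static Map<Integer, String> toFieldIdRenames(
            Map<String, String> renamesByName,    // original name -> new name
            Map<String, Integer> fieldIdsByName)  // the specific Schema: name -> field id
    {
        final Map<Integer, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : renamesByName.entrySet()) {
            final Integer fieldId = fieldIdsByName.get(e.getKey());
            if (fieldId == null) {
                // The rename key must exist in the Schema it was written against.
                throw new IllegalArgumentException("Unknown column: " + e.getKey());
            }
            out.put(fieldId, e.getValue());
        }
        return out;
    }
}
```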
I believe we need to offer Iceberg reading support based on a user-specified Schema (likely a Schema sourced from some point in the Table's history, or some subset of one). In the context where a user passes in String column renames, the keys of that map are tied to a specific Schema; in this way the map can also be converted internally into a field-id rename, which remains valid across subsequent column renames. It may also be important for an enterprise to establish and record the specific Schema a table was initially ingested with (for example, they may want to enforce that

`db.t("MyNamespace", "MyTable")`

produces the same exact output today as it did yesterday). It is not enough to simply record the latest, or even a specific, Snapshot, because the Schema of the Table may change without a new Snapshot: for example, when a column is renamed, the Table's Schema is updated, but no new Snapshot is created.

As a point of convention, it probably makes sense to assume these defaults (in order):

1. `table.schemas().get(snapshot.schemaId())`
2. `table.schema()`; note, this is not equivalent to `table.schemas().get(table.currentSnapshot().schemaId())`, for the reasons mentioned above.
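The proposed default order (the snapshot's recorded schema first, falling back to the Table's current schema) can be sketched as a small helper; `SchemaResolution` is hypothetical, with schemas simplified to a generic type rather than Iceberg's `Schema`:

```java
import java.util.Map;

// Hypothetical sketch of the proposed default schema resolution. In the real
// Iceberg API, table.schemas() returns Map<Integer, Schema> and
// table.schema() returns the Table's current schema.
final class SchemaResolution {
    static <S> S resolve(Integer snapshotSchemaId, Map<Integer, S> schemasById, S currentSchema) {
        if (snapshotSchemaId != null) {
            // First preference: the schema recorded against the snapshot.
            final S fromSnapshot = schemasById.get(snapshotSchemaId);
            if (fromSnapshot != null) {
                return fromSnapshot;
            }
        }
        // Fallback: the Table's current schema, which may be newer than any
        // snapshot's schema (e.g. after a column rename with no new snapshot).
        return currentSchema;
    }
}
```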