-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse the partition structure using the delta_log as the base path #510
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
Could you maybe add a simple example to the PR description so it's clear what kind of table paths this PR fixes?
@@ -1036,4 +1036,19 @@ trait ConvertToDeltaHiveTableTests extends ConvertToDeltaTestUtils with SQLTestU | |||
} | |||
} | |||
} | |||
|
|||
test("convert use the base path to parse partition structure") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: maybe change the name to reflect the outcome not the implementation, e.g., can convert a partition-like table path
This allows to convert a location that is under a path that looks like a partition. Signed-off-by: Mike Dias <mike.rodrigues.dias@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for fixing this.
…g as the base path This allows converting a location that is under a path that looks like a partition value. For example, running the convert to delta command over a path like `s3://massive-events/year=2020/` would fail because the command will try to compare the partitions above the delta log base path. Signed-off-by: Mike Dias <mike.rodrigues.dias@gmail.com> Closes delta-io#510 Signed-off-by: liwensun <liwen.sun@databricks.com> Author: Mike Dias <mike.rodrigues.dias@gmail.com> #12643 is resolved by liwensun/4w7hocg4. GitOrigin-RevId: 4dc2e55c97b47b3ce928a5a780b115324170e95d (cherry picked from commit ca37b4a)
* [FlinkSQL_PR_1] Flink Delta Sink - Table API UPDATED (delta-io#389) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> Signed-off-by: Krzysztof Chmielewski <krzysztof.chmielewski@getindata.com> Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> Co-authored-by: Paweł Kubit <pawel.kubit@getindata.com> Co-authored-by: Krzysztof Chmielewski <krzysztof.chmielewski@getindata.com> * [FlinkSQL_PR_2] - SQL Support for Delta Source connector. (delta-io#487) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_3] - Delta catalog skeleton (delta-io#503) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_4] - Delta catalog - Interactions with DeltaLog. Create and get table. (delta-io#506) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_5] - Delta catalog - DDL option validation. (delta-io#509) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_6] - Delta catalog - alter table + tests. (delta-io#510) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_7] - Delta catalog - Restrict Delta Table factory to work only with Delta Catalog + tests. (delta-io#514) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_8] - Delta Catalog - DDL/Query hint validation + tests. (delta-io#520) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_9] - Delta Catalog - Adding Flink's Hive catalog as decorated catalog. (delta-io#524) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_10] - Table API support SELECT with filter on partition column. (delta-io#528) * [FlinkSQL_PR_10] - Table API support SELECT with filter on partition column. --------- Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> Co-authored-by: Scott Sandre <scott.sandre@databricks.com> * [FlinkSQL_PR_11] - Delta Catalog - cache DeltaLog instances in DeltaCatalog. (delta-io#529) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_12] - UML diagrams. (delta-io#530) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_13] - Remove mergeSchema option from SQL API. (delta-io#531) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_14] - SQL examples. (delta-io#535) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * remove duplicate function after rebasing against master --------- Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> Signed-off-by: Krzysztof Chmielewski <krzysztof.chmielewski@getindata.com> Co-authored-by: kristoffSC <krzysiek.chmielewski@gmail.com> Co-authored-by: Paweł Kubit <pawel.kubit@getindata.com> Co-authored-by: Krzysztof Chmielewski <krzysztof.chmielewski@getindata.com>
* [FlinkSQL_PR_1] Flink Delta Sink - Table API UPDATED (delta-io#389) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> Signed-off-by: Krzysztof Chmielewski <krzysztof.chmielewski@getindata.com> Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> Co-authored-by: Paweł Kubit <pawel.kubit@getindata.com> Co-authored-by: Krzysztof Chmielewski <krzysztof.chmielewski@getindata.com> * [FlinkSQL_PR_2] - SQL Support for Delta Source connector. (delta-io#487) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_3] - Delta catalog skeleton (delta-io#503) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_4] - Delta catalog - Interactions with DeltaLog. Create and get table. (delta-io#506) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_5] - Delta catalog - DDL option validation. (delta-io#509) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_6] - Delta catalog - alter table + tests. (delta-io#510) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_7] - Delta catalog - Restrict Delta Table factory to work only with Delta Catalog + tests. (delta-io#514) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_8] - Delta Catalog - DDL/Query hint validation + tests. (delta-io#520) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_9] - Delta Catalog - Adding Flink's Hive catalog as decorated catalog. (delta-io#524) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_10] - Table API support SELECT with filter on partition column. (delta-io#528) * [FlinkSQL_PR_10] - Table API support SELECT with filter on partition column. --------- Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> Co-authored-by: Scott Sandre <scott.sandre@databricks.com> * [FlinkSQL_PR_11] - Delta Catalog - cache DeltaLog instances in DeltaCatalog. (delta-io#529) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_12] - UML diagrams. (delta-io#530) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_13] - Remove mergeSchema option from SQL API. (delta-io#531) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * [FlinkSQL_PR_14] - SQL examples. (delta-io#535) Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> * remove duplicate function after rebasing against master --------- Signed-off-by: Krzysztof Chmielewski <krzysiek.chmielewski@gmail.com> Signed-off-by: Krzysztof Chmielewski <krzysztof.chmielewski@getindata.com> Co-authored-by: kristoffSC <krzysiek.chmielewski@gmail.com> Co-authored-by: Paweł Kubit <pawel.kubit@getindata.com> Co-authored-by: Krzysztof Chmielewski <krzysztof.chmielewski@getindata.com>
This allows converting a location that is under a path that looks like a partition value. For example, running the convert to delta command over a path like
s3://massive-events/year=2020/
would fail because the command will try to compare the partitions above the delta log base path.Signed-off-by: Mike Dias mike.rodrigues.dias@gmail.com