[AMORO-1457] Support computed columns and watermark in Flink DDL #2239
Conversation
Some unit tests failed, I'll try to fix them.
@xieyi888 there are some UTs failing around Amoro's schema fetching, could you kindly take a look? TestJoin.testLookupJoin:221->FlinkTestBase.exec:203->FlinkTestBase.exec:207 » IllegalArgument Field op_time not found in source schema
Force-pushed from 8a41de6 to 579ddac.
Codecov Report: all modified and coverable lines are covered by tests ✅

@@ Coverage Diff @@
##             master    #2239      +/-   ##
============================================
+ Coverage     52.65%   53.02%   +0.37%
+ Complexity     4219     3632     -587
============================================
  Files           511      465      -46
  Lines         29358    24650    -4708
  Branches       2853     2340     -513
============================================
- Hits          15457    13070    -2387
+ Misses        12653    10560    -2093
+ Partials       1248     1020     -228
Thanks, I separated out a getPhysicalSchemaForDimTable function for when dim_table.enabled=true. TestJoin now works normally.
… AMORO_1457_support_compute
# Conflicts:
#   flink/v1.15/flink/src/main/java/com/netease/arctic/flink/FlinkSchemaUtil.java
#   flink/v1.15/flink/src/main/java/com/netease/arctic/flink/catalog/ArcticCatalog.java
#   flink/v1.15/flink/src/main/java/com/netease/arctic/flink/table/DynamicTableFactory.java
#   flink/v1.15/flink/src/main/java/com/netease/arctic/flink/table/descriptors/ArcticValidator.java
#   flink/v1.15/flink/src/main/java/com/netease/arctic/flink/write/FlinkSink.java
#   flink/v1.15/flink/src/test/java/com/netease/arctic/flink/catalog/TestCatalog.java
@YesOrNo828 @zhoujinsong Could you please review this PR?
@xieyi888 Thanks for the PR, I left some comments, please kindly have a look.
@xieyi888 Thanks a lot for your work!
I left some comments. Please take another look.
BTW, I'm curious about the property design of computed columns. The index in the property key doesn't seem to have a specific use. Would a design like this be better for users to understand these properties, such as:
HDYT? @xieyi888 @YesOrNo828
# Conflicts:
#   flink/flink-common/src/test/java/com/netease/arctic/flink/catalog/TestCatalog.java
Force-pushed from 4882f10 to 2d8bd55.
The index depends on the Flink TableSchema index; it just distinguishes the different columns. Although it seems not very convenient for users to understand the properties, it is better for serialization/deserialization when the field name contains special characters. HDYT? @YesOrNo828
Yes, including column names in property names can indeed introduce special characters. So perhaps we can still use an index to differentiate computed columns. The index would only count the computed columns, starting from 1, and we would maintain this order when reassembling the schema. This might make it easier for users to understand.
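The two candidate layouts being discussed can be illustrated with a short sketch. The property key names below are made up for illustration only; they are not the keys the PR actually ships:

```python
# Two hypothetical key layouts for storing computed-column expressions in
# table properties. Scheme 1 reuses the global TableSchema index; scheme 2
# numbers only the computed columns, starting from 1.
schema = [
    ("id", "INT", None),                       # physical column
    ("ts", "TIMESTAMP(3)", None),              # physical column
    ("proc", "TIMESTAMP_LTZ(3)", "PROCTIME()") # computed column
]

# Scheme 1: global TableSchema index ('proc' sits at index 2).
global_keys = {
    f"flink.schema.{i}.expr": expr
    for i, (_, _, expr) in enumerate(schema)
    if expr is not None
}
# global_keys == {"flink.schema.2.expr": "PROCTIME()"}

# Scheme 2: computed-column-only index, starting from 1.
computed = [col for col in schema if col[2] is not None]
local_keys = {
    f"flink.computed-column.{i}.expr": expr
    for i, (_, _, expr) in enumerate(computed, start=1)
}
# local_keys == {"flink.computed-column.1.expr": "PROCTIME()"}
```

Scheme 2 keeps the keys compact, but as the thread notes, it only works if reassembly can still recover where each computed column belongs in the full schema.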
I agree with using the index to determine the order of fields (including physical and computed columns) instead of the field name. There would be a data type mismatch exception if we only used an index to differentiate computed columns without including the physical columns.
The query of
I agree with it.
If it's a computed column, we also serialize the expression.
If it's a metadata column, we also serialize metadata and virtual.
HDYT @YesOrNo828 @zhoujinsong
If a physical column were deleted by Spark or another engine, the properties for computed columns would be destroyed, leading to inaccurate deserialization of computed columns. We decided to require that computed columns be listed after all physical columns, and to validate the column order during table creation.
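The ordering rule decided on here can be sketched as a small validation step. This is a hypothetical helper for illustration, not the PR's actual code:

```python
def validate_column_order(columns):
    """Reject schemas where a physical column appears after a computed one.

    `columns` is a list of (name, is_computed) pairs in declaration order.
    Keeping all computed columns at the tail means deleting a physical
    column in another engine cannot silently shift a computed column's
    position in the reassembled schema.
    """
    seen_computed = False
    for name, is_computed in columns:
        if is_computed:
            seen_computed = True
        elif seen_computed:
            raise ValueError(
                f"physical column '{name}' must be declared "
                "before all computed columns"
            )

# Valid: physical columns first, computed column last.
validate_column_order([("id", False), ("ts", False), ("proc", True)])
```

A schema such as `[("proc", True), ("id", False)]` would be rejected at create-table time under this rule.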
…888/arctic into AMORO_1457_support_compute
done
I left some comments, please take another look.
Yi Xie seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
LGTM.
Thanks a lot for your contribution!
…he#2239)
* supoort_colume and watermark for flink1.15
* separate getPhysicalSchemaForDimTable when dim_table.enabled=true
* rm ut for flink1.15
* change for review comments
* change for master conflict
* fix checkstyle
* change compute index start from1
* change method name
* change doc for computr column
* fix check style
* fix for comments
---------
Co-authored-by: Yi Xie <xieyi01@rd.netease.com>
Co-authored-by: ZhouJinsong <zhoujinsong0505@163.com>
Why are the changes needed?
Close #1457.
Brief change log
Support serializing computed columns and watermarks into Arctic table properties.
The computed columns and watermark are stored in Arctic properties like below.
Support deserializing computed columns and watermarks into the Flink TableSchema.
Only the Flink physical columns are converted into the Arctic (Iceberg) schema during table creation; computed columns and watermarks are serialized into Arctic table properties.
When the table is fetched (for writes and reads), the computed columns and watermarks are deserialized into a Flink TableSchema and combined with the TableSchema converted from the Arctic (Iceberg) schema.
Only one watermark strategy is supported.
Altering properties for computed columns and the watermark strategy is not supported yet.
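The overall approach in this change log can be sketched as a serialize/deserialize round trip. All property key names and helper functions below are illustrative assumptions, not the PR's real keys or code:

```python
# Sketch: only physical columns become the Iceberg schema; computed columns
# and the single watermark strategy live in table properties and are merged
# back into the Flink TableSchema when the table is read.

def serialize(physical, computed, watermark):
    """physical: [(name, type)], computed: [(name, expr)],
    watermark: (rowtime_column, expr) or None."""
    props = {}
    for i, (name, expr) in enumerate(computed, start=1):
        props[f"flink.computed-column.{i}.name"] = name
        props[f"flink.computed-column.{i}.expr"] = expr
    if watermark is not None:  # only one watermark strategy is supported
        col, expr = watermark
        props["flink.watermark.rowtime"] = col
        props["flink.watermark.expr"] = expr
    return physical, props  # physical columns go into the Iceberg schema

def deserialize(physical, props):
    """Rebuild the full column list and watermark from properties."""
    columns = list(physical)
    i = 1
    while f"flink.computed-column.{i}.name" in props:
        columns.append((props[f"flink.computed-column.{i}.name"],
                        props[f"flink.computed-column.{i}.expr"]))
        i += 1
    watermark = None
    if "flink.watermark.rowtime" in props:
        watermark = (props["flink.watermark.rowtime"],
                     props["flink.watermark.expr"])
    return columns, watermark

physical = [("id", "INT"), ("ts", "TIMESTAMP(3)")]
computed = [("proc", "PROCTIME()")]
wm = ("ts", "ts - INTERVAL '5' SECOND")
cols, wm_back = deserialize(*serialize(physical, computed, wm))
# cols lists the physical columns first, then the computed column;
# wm_back matches the original watermark strategy.
```

Because the computed columns always follow the physical ones, the index in the property key is enough to reassemble the schema in a stable order.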
How was this patch tested?
Add some test cases that check the changes thoroughly including negative and positive cases if possible
Add screenshots for manual tests if appropriate
Run test locally before making a pull request
Local tests include the following:
flink 1.15, mix_iceberg
1.1 Create an Arctic table with computed columns and watermark under the Flink Arctic catalog.
1.2 Read and write the table with computed columns.
1.3 Read the Arctic table and do a lookup join with a MySQL dimension table.
2.1 Create an Arctic table with computed columns under the Flink default catalog; read with computed columns.
2.2 Read the Arctic table and do a lookup join with a MySQL dimension table.
Documentation
yes
docs