What happened?
Even when no optimizing is needed, an Iceberg format table still undergoes major optimizing repeatedly.
The table has full optimizing enabled with the following configuration:
'self-optimizing.full.trigger.interval'='86400000'
Affects Versions
master
What engines are you seeing the problem on?
AMS, Optimizer
How to reproduce
CREATE TABLE spark_catalog.dl_ods.ods_iceberg_t1 (
  channel_id INT,
  label STRING,
  price_sign BIGINT,
  item_id BIGINT,
  item_type INT,
  is_maintain INT,
  cur_name STRING,
  adjust_code STRING,
  platform_id BIGINT,
  business_id BIGINT NOT NULL,
  price DECIMAL(20,4),
  price_type_name STRING,
  price_type_id BIGINT,
  price_type_code STRING,
  sku_id BIGINT,
  goods_no STRING,
  spu_id BIGINT,
  store_id BIGINT,
  gmt_update TIMESTAMP,
  gmt_create TIMESTAMP,
  id BIGINT NOT NULL,
  store_status STRING,
  store_no STRING,
  com_id STRING,
  store_name STRING)
USING iceberg
PARTITIONED BY (business_id)
LOCATION 'hdfs://xxxxx/user/hive/warehouse/datalake/dl_ods/ods_iceberg_t1'
TBLPROPERTIES (
  'clean-independent-delete-files.enabled' = 'true',
  'clean-orphan-file.enabled' = 'true',
  'clean-orphan-file.min-existing-time-minutes' = '1440',
  'current-snapshot-id' = '5211867258629833319',
  'engine.hive.enabled' = 'true',
  'flink.max-continuous-empty-commits' = '2147483647',
  'format' = 'iceberg/parquet',
  'format-version' = '2',
  'identifier-fields' = '[id,business_id]',
  'self-optimizing.enabled' = 'true',
  'self-optimizing.full.trigger.interval' = '-1',
  'self-optimizing.group' = 'external-group',
  'self-optimizing.quota' = '0.1',
  'snapshot.base.keep.minutes' = '60',
  'table-expire.enabled' = 'true',
  'write.distribution-mode' = 'hash',
  'write.metadata.delete-after-commit.enabled' = 'true',
  'write.metadata.previous-versions-max' = '1',
  'write.upsert.enabled' = 'true');
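For anyone trying to reproduce this, the following is a minimal sketch rather than the reporter's actual commands. It assumes the full-optimizing interval was applied with Spark SQL after the table was created (the DDL carries '-1' while the description cites '86400000'), and it uses Iceberg's `snapshots` metadata table to check how often rewrite commits land. Treating `operation = 'replace'` as the marker for optimizing commits is an assumption about how the optimizer commits its rewrites.

```sql
-- Assumption: the interval from the description was set after table creation.
ALTER TABLE spark_catalog.dl_ods.ods_iceberg_t1
SET TBLPROPERTIES ('self-optimizing.full.trigger.interval' = '86400000');

-- List recent rewrite-style commits from Iceberg's snapshots metadata table.
-- Assumption: optimizing commits appear with operation = 'replace'.
SELECT committed_at, snapshot_id, operation, summary
FROM spark_catalog.dl_ods.ods_iceberg_t1.snapshots
WHERE operation = 'replace'
ORDER BY committed_at DESC
LIMIT 20;
```

If such commits keep appearing at short intervals while no new data is being written, that would be consistent with the repeated optimizing described here.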
Thanks for your report! I will add this issue to the roadmap for version 0.5.1 and look forward to your PR.👍