Closed
Description
Apache Iceberg version
0.14.1
Query engine
Flink
Please describe the bug 🐞
flink: 1.13.5
iceberg: 0.13.2 / 0.14.1
When I call the rewrite-files action API rewriteDataFiles(), the new compacted data files are written, but no corresponding manifest file is generated. I tried Iceberg 0.13.2 and 0.14.1, and both show the same problem with Hive and Hadoop catalogs.
The Iceberg Maven dependency, the catalog configuration, the table definition, and the Java code used to compact the files are as follows:
<dependency>
    <groupId>org.apache.iceberg</groupId>
    <artifactId>iceberg-flink-runtime-1.13</artifactId>
    <!-- <version>0.13.2</version> -->
    <version>0.14.1</version>
</dependency>
name: iceberg_hive_catalog
type: iceberg
catalog-type: hive
uri: thrift://xxxxx:9083
clients: 5
property-version: 1
warehouse: hdfs://nameservice1/user/hive/warehouse/
create table iceberg_hive_catalog.dhome_db.ods_d_base_inf_229_iceberg (
    `did` string,
    `name` string,
    `address` string,
    `did_seq` string,
    PRIMARY KEY (did_seq) NOT ENFORCED
) with (
    'format-version' = '2',
    'write.upsert.enabled' = 'true',
    'write.metadata.delete-after-commit.enabled' = 'true',
    'write.metadata.previous-versions-max' = '5',
    'flink.rewrite.enable' = 'true',
    'flink.rewrite.parallelism' = '5',
    'flink.rewrite.target-file-size-bytes' = '536870912',
    'flink.rewrite.max-files-count' = '5'
);
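import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.flink.TableLoader;
import org.apache.iceberg.flink.actions.Actions;

final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
TableIdentifier identifier = TableIdentifier.of("dhome_db", "ods_d_base_inf_229_iceberg");
// hive_iceberg is the CatalogLoader for the Hive catalog configured above (its construction is not shown).
TableLoader tableLoader = TableLoader.fromCatalog(hive_iceberg, identifier);
tableLoader.open();
Table table_iceberg = tableLoader.loadTable();
// Compact the table's data files toward 128 MiB each, with a maximum parallelism of 5.
Actions.forTable(env, table_iceberg)
    .rewriteDataFiles()
    .maxParallelism(5)
    .targetSizeInBytes(128L * 1024 * 1024)
    .execute();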
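Note that the two target file sizes in this report differ: the table property 'flink.rewrite.target-file-size-bytes' is 536870912 bytes, while the Java call passes 128*1024*1024 to targetSizeInBytes(). The following is only a small sketch of that arithmetic using the plain JDK; the class and method names (TargetSizeCheck, toMiB) are hypothetical and not part of Iceberg. Whether the rewriteDataFiles() action reads the flink.rewrite.* table properties at all is not confirmed here.

```java
public class TargetSizeCheck {
    // Value of 'flink.rewrite.target-file-size-bytes' in the table definition above.
    static final long TABLE_PROPERTY_BYTES = 536_870_912L;

    // Value passed to targetSizeInBytes(...) in the Java call above.
    // The L suffix keeps the multiplication in long arithmetic.
    static final long ACTION_ARGUMENT_BYTES = 128L * 1024 * 1024;

    // Convert a byte count to whole mebibytes (1 MiB = 1024 * 1024 bytes).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("table property:  " + toMiB(TABLE_PROPERTY_BYTES) + " MiB");
        System.out.println("action argument: " + toMiB(ACTION_ARGUMENT_BYTES) + " MiB");
    }
}
```

So the table property asks for 512 MiB files while the explicit call asks for 128 MiB files.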
The results:
If anything in this report is wrong or unclear, please correct me. Thank you.