[opt](iceberg) support create branch/tag for iceberg #51727
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
|
run buildall |
|
run buildall |
TPC-H: Total hot run time: 35852 ms |
TPC-DS: Total hot run time: 197410 ms |
ClickBench: Total hot run time: 29.14 s |
08791a9 to
b113839
Compare
|
run buildall |
TPC-H: Total hot run time: 33983 ms |
TPC-DS: Total hot run time: 185994 ms |
ClickBench: Total hot run time: 29.45 s |
|
run buildall |
TPC-H: Total hot run time: 33961 ms |
TPC-DS: Total hot run time: 185428 ms |
ClickBench: Total hot run time: 29.13 s |
edbd112 to
e1eb917
Compare
|
run buildall |
TPC-H: Total hot run time: 34614 ms |
TPC-DS: Total hot run time: 186466 ms |
ClickBench: Total hot run time: 29.77 s |
FE UT Coverage Report: Increment line coverage |
a5f1a1a to
02d730c
Compare
|
run buildall |
TPC-H: Total hot run time: 34097 ms |
TPC-DS: Total hot run time: 185314 ms |
ClickBench: Total hot run time: 30.01 s |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
Related Issue: apache#48285

Problem Summary:

You can now create Iceberg branches and tags through Doris.

```sql
-- branch
alter table tb1 create branch b1;
alter table tb1 create branch if not exists b1;
alter table tb1 create or replace branch b1;
alter table tb1 create or replace branch b1 AS OF VERSION <version>;
alter table tb1 create or replace branch b1 AS OF VERSION <version> RETAIN 1 HOURS;
-- Create b1 at snapshot 1234, retain b1 for 30 days, and keep the latest 3 snapshots plus 2 days' worth of snapshots.
alter table tb1 CREATE BRANCH b1 AS OF VERSION 1234 RETAIN 30 DAYS WITH SNAPSHOT RETENTION 3 SNAPSHOTS 2 DAYS;

-- tag
alter table tb1 create tag t1;
alter table tb1 create tag if not exists t1;
alter table tb1 create or replace tag t1;
alter table tb1 create or replace tag t1 AS OF VERSION <version>;
-- Create t1 at snapshot 1234 and retain it for 1 year.
alter table tb1 create tag t1 AS OF VERSION 1234 RETAIN 365 DAYS;
```

The supported time units include: DAYS, HOURS, MINUTES
…comprehensive test coverage (#59917)

- Related Pr: #51727

### What problem does this PR solve?

## Overview

This PR improves Iceberg table branch/tag functionality by adding input parameter validation, optimizing snapshot loading logic, and significantly expanding test coverage.

## Iceberg Logic Modifications

### 1. Optimize Snapshot Loading

**File**: `fe/fe-core/src/main/java/org/apache/doris/nereids/StatementContext.java`

- Modified `loadSnapshots` method signature to accept `specificTable` parameter
- Supports loading snapshot for a specific table instead of iterating through all tables
- Caller (`BindRelation.java:419`) passes the concrete table object, improving performance and precision

### 2. Enhanced Parameter Validation

**File**: `fe/fe-core/src/main/java/org/apache/doris/nereids/parser/LogicalPlanBuilder.java`

Added parameter validation for branch/tag creation operations:

- `RETAIN` time value must be greater than 0
- `SNAPSHOTS` (minimum snapshots to keep) must be greater than 0
- `RETENTION` time must be greater than 0
- Throws clear `IllegalArgumentException` with descriptive error messages

### 3. Branch/Tag Name Validation

**File**: `fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/IcebergMetadataOps.java`

- Validate branch name is not empty when creating a branch
- Validate tag name is not empty when creating a tag
- Throws `UserException` to notify users

### 4. Syntax Support Limitation

**File**: `regression-test/suites/external_table_p0/iceberg/iceberg_branch_tag_operate.groovy`

Clarified that `CREATE OR REPLACE BRANCH IF NOT EXISTS` syntax combination is not supported

## New Test Case Coverage

This PR adds 10 comprehensive test suites covering the following scenarios:

| Test File | Coverage |
|-----------|----------|
| `iceberg_branch_complex_queries.groovy` | Complex query scenarios with branch operations |
| `iceberg_branch_cross_operations.groovy` | Cross operations between branches and tags |
| `iceberg_branch_partition_operations.groovy` | Partition-related branch operations |
| `iceberg_branch_retention_and_snapshot.groovy` | Snapshot expiration and retention policies |
| `iceberg_branch_tag_auth.groovy` | Branch/tag permission and authorization |
| `iceberg_branch_tag_edge_cases.groovy` | Edge cases and exception handling |
| `iceberg_branch_tag_parallel_op.groovy` | Concurrent/parallel operations testing |
| `iceberg_branch_tag_schema_change_extended.groovy` | Schema change scenarios |
| `iceberg_branch_tag_system_tables.groovy` | System table query verification |
| `iceberg_tag_retention_and_consistency.groovy` | Tag consistency validation |

## Improvements

- **More precise snapshot loading**: Avoids unnecessary full table iteration
- **Stronger parameter validation**: Catches configuration errors early with clear error messages
- **Comprehensive test coverage**: Multi-dimensional validation from edge cases to concurrent operations

Co-authored-by: zgxme <u143@qq.com>
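The validation rules listed in the follow-up commit (#59917) can be sketched roughly as below. This is an illustration only, not the code in `LogicalPlanBuilder.java` or `IcebergMetadataOps.java`: the class `BranchTagOptionsSketch`, its method names, and the error messages are all hypothetical. To stay self-contained it also throws `IllegalArgumentException` for an empty name, whereas the commit message says Doris throws `UserException` there.

```java
// Hypothetical sketch of the checks described in #59917; none of these names come from Doris.
final class BranchTagOptionsSketch {
    private BranchTagOptionsSketch() {}

    /** Rejects zero or negative values for a numeric clause such as RETAIN. */
    static void requirePositive(String clause, long value) {
        if (value <= 0) {
            throw new IllegalArgumentException(
                    clause + " must be greater than 0, but got " + value);
        }
    }

    /**
     * Validates the options of a CREATE BRANCH / CREATE TAG statement.
     * A null option means the corresponding clause was omitted.
     */
    static void validate(String refName, boolean orReplace, boolean ifNotExists,
                         Long retain, Integer minSnapshots, Long retention) {
        // Assumption: Doris reports this case with UserException, not shown here.
        if (refName == null || refName.trim().isEmpty()) {
            throw new IllegalArgumentException("Branch or tag name must not be empty");
        }
        // CREATE OR REPLACE ... IF NOT EXISTS is described as an unsupported combination.
        if (orReplace && ifNotExists) {
            throw new IllegalArgumentException(
                    "OR REPLACE cannot be combined with IF NOT EXISTS");
        }
        if (retain != null) {
            requirePositive("RETAIN", retain);
        }
        if (minSnapshots != null) {
            requirePositive("SNAPSHOTS", minSnapshots);
        }
        if (retention != null) {
            requirePositive("RETENTION", retention);
        }
    }

    public static void main(String[] args) {
        // RETAIN 30 DAYS WITH SNAPSHOT RETENTION 3 SNAPSHOTS 2 DAYS: all values positive.
        validate("b1", false, false, 30L, 3, 2L);
        System.out.println("valid options accepted");
        try {
            validate("b1", false, false, 0L, null, null); // RETAIN 0 is rejected
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Checking the values at parse time, as the commit describes, means a statement such as `RETAIN 0 DAYS` fails before any Iceberg metadata is touched.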
What problem does this PR solve?
Related Issue: #48285
Problem Summary:
You can now create Iceberg branches and tags through Doris.
The supported time units include: DAYS, HOURS, MINUTES
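Iceberg stores reference and snapshot ages in milliseconds, so a `RETAIN <n> <unit>` clause has to be converted at some point. The following is a minimal sketch of that conversion for the three listed units, not the implementation in this PR; the class `RetentionSketch` and the method `toMillis` are hypothetical names.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; the name and shape are assumptions, not Doris code.
final class RetentionSketch {
    private RetentionSketch() {}

    /** Converts a retention value in DAYS, HOURS, or MINUTES to milliseconds. */
    static long toMillis(long value, String unit) {
        switch (unit.toUpperCase(Locale.ROOT)) {
            case "DAYS":
                return TimeUnit.DAYS.toMillis(value);
            case "HOURS":
                return TimeUnit.HOURS.toMillis(value);
            case "MINUTES":
                return TimeUnit.MINUTES.toMillis(value);
            default:
                // Only the three units named in the PR description are accepted.
                throw new IllegalArgumentException("Unsupported time unit: " + unit);
        }
    }

    public static void main(String[] args) {
        // RETAIN 30 DAYS from the branch example.
        System.out.println(toMillis(30, "DAYS"));
        // RETAIN 1 HOURS from the create-or-replace example.
        System.out.println(toMillis(1, "HOURS"));
    }
}
```

Matching the unit case-insensitively mirrors the mixed-case SQL in the examples; whether Doris itself accepts lowercase units is not stated in this PR.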
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)