[fix](paimon)Handle oversized CHAR/VARCHAR fields in Paimon to Doris type mapping #55051
run buildall
run buildall
TPC-H: Total hot run time: 34695 ms
TPC-DS: Total hot run time: 185749 ms
ClickBench: Total hot run time: 32.36 s
FE UT Coverage Report: Increment line coverage
run buildall
TPC-H: Total hot run time: 33797 ms
TPC-DS: Total hot run time: 185477 ms
ClickBench: Total hot run time: 33.21 s
FE UT Coverage Report: Increment line coverage
suxiaogang223
left a comment
LGTM
PR approved by anyone and no changes requested.
PR approved by at least one committer and no changes requested.
…type mapping (#55051)
…type mapping (apache#55051) (cherry picked from commit 6622f50)
### What problem does this PR solve?

Related PR: apache#55051 apache#55070

apache#55051 changed the Paimon type from `varchar(2147483647)` to `text`.
apache#55070 added the case
What problem does this PR solve?
In PR #49623, we implemented conversion from Paimon `VARCHAR/CHAR` types to Doris `VARCHAR/CHAR` types. However, there are significant differences in the maximum length constraints between these systems:

**Apache Paimon:**
- `CHAR`: Fixed-length character string declared using `CHAR(n)`, where n is the number of code points. n must have a value between `1` and `2,147,483,647` (inclusive). Defaults to n=1 if no length is specified.
- `VARCHAR`: Variable-length character string declared using `VARCHAR(n)`, where n is the maximum number of code points. n must have a value between `1` and `2,147,483,647` (inclusive). Defaults to n=1 if no length is specified.

**Apache Doris:**
- `CHAR`: Maximum length is `255` characters
- `VARCHAR`: Maximum length is `65,533` characters

**Solution:**
This PR addresses the length constraint mismatch by automatically converting oversized Paimon `VARCHAR`/`CHAR` types to the Doris `STRING` type when they exceed the Doris limits:
- Paimon `VARCHAR` with length > 65,533 → Doris `STRING`
- Paimon `CHAR` with length > 255 → Doris `STRING`

This ensures compatibility while preserving data integrity during type mapping from Paimon to Doris.
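The mapping rule described above can be sketched roughly as follows. This is only an illustration of the rule, not the code from this PR: the class name, method names, and the use of plain strings for type names are all hypothetical, and the only values taken from the description are the two Doris limits (255 and 65,533).

```java
// Hypothetical sketch of the length-based fallback; none of these names
// come from the Doris or Paimon code bases.
final class PaimonStringTypeMappingSketch {
    // Doris limits as quoted in the PR description.
    static final int DORIS_MAX_CHAR_LENGTH = 255;
    static final int DORIS_MAX_VARCHAR_LENGTH = 65533;

    private PaimonStringTypeMappingSketch() {
    }

    // Maps a Paimon CHAR(n) to a Doris type name: CHAR(n) when n fits,
    // STRING when n exceeds the Doris CHAR limit.
    static String mapChar(int length) {
        return length > DORIS_MAX_CHAR_LENGTH ? "STRING" : "CHAR(" + length + ")";
    }

    // Maps a Paimon VARCHAR(n) to a Doris type name: VARCHAR(n) when n fits,
    // STRING when n exceeds the Doris VARCHAR limit.
    static String mapVarchar(int length) {
        return length > DORIS_MAX_VARCHAR_LENGTH ? "STRING" : "VARCHAR(" + length + ")";
    }
}
```

Under these assumptions, `mapChar(255)` stays `CHAR(255)` while `mapChar(256)` falls back to `STRING`; likewise `mapVarchar(65533)` stays `VARCHAR(65533)` and `mapVarchar(Integer.MAX_VALUE)` (Paimon's upper bound of 2,147,483,647) becomes `STRING`.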
Release note
None