[fix](hive)fix querying hive text table with NULL DEFINED AS '' #55626
9e9d51a to 06c9e3b
run buildall
run buildall
TPC-H: Total hot run time: 34052 ms
TPC-DS: Total hot run time: 186395 ms
ClickBench: Total hot run time: 33.09 s
run buildall
TPC-H: Total hot run time: 34128 ms
TPC-DS: Total hot run time: 186808 ms
ClickBench: Total hot run time: 32.68 s
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
### What problem does this PR solve?

Problem Summary:

This pull request improves the handling of empty-string null formats and delimiter properties for Hive external tables. Previously, when a Hive text table declared `NULL DEFINED AS ''`, Doris returned an empty string for NULL fields instead of NULL; with this change those fields are read as NULL.

For a Hive text table like this:

```sql
CREATE TABLE test_empty_null_defined_text (
    id INT,
    name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
NULL DEFINED AS ''
STORED AS TEXTFILE;

INSERT INTO TABLE test_empty_null_defined_text VALUES (1, 'Alice'), (2, NULL);
```

Query in Doris:

```sql
select * from test_empty_null_defined_text;
```

Result before this PR:

```text
+------+-------+
| id   | name  |
+------+-------+
|    1 | Alice |
|    2 |       |
+------+-------+
```

Result after this PR:

```text
+------+-------+
| id   | name  |
+------+-------+
|    1 | Alice |
|    2 | NULL  |
+------+-------+
```
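
The before/after difference can be illustrated with a minimal sketch of null-format matching for a delimited text row. This is not the Doris reader; `parse_text_row` and its parameters are hypothetical names used only for illustration. It assumes Hive's default null format is `\N` and that the row `(2, NULL)` is stored as `2`, a tab, and an empty field.

```python
def parse_text_row(line, field_delim="\t", null_format="\\N"):
    """Split one text-format row; fields equal to null_format become None.

    Hypothetical helper for illustration only, not the Doris implementation.
    """
    fields = line.rstrip("\n").split(field_delim)
    return [None if field == null_format else field for field in fields]


# Rows as they would appear in the data file for the example table.
row_alice = parse_text_row("1\tAlice\n", null_format="")
row_null = parse_text_row("2\t\n", null_format="")

# Ignoring the table's empty-string null format (falling back to the
# default) leaves the empty field as an empty string, which matches the
# "before" result.
row_default = parse_text_row("2\t\n")

print(row_alice, row_null, row_default)
```

Here `row_null` corresponds to the "after" behavior (the empty field becomes NULL), while `row_default` corresponds to the "before" behavior.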
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)