[Enhancement](load) http load using SQL #21621
The current implementation uses an HTTP request to submit a SQL statement to the BE.
clang-tidy review says "All clean, LGTM! 👍"
run buildall
Great feature. I have wanted to implement this for a long time.
Would insert into db.table select expr1, expr2 from http() be better?
Yes, I have done some basic work on it and will keep improving it.
Conflicts: be/src/vec/exec/format/csv/csv_reader.cpp fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java
Does select col1+col2, col2 from http() work?
DEFINE_COUNTER_METRIC_PROTOTYPE_2ARG(http_load_duration_ms, MetricUnit::MILLISECONDS);
DEFINE_GAUGE_METRIC_PROTOTYPE_2ARG(http_load_current_processing, MetricUnit::REQUESTS);

void HttpLoadAction::_parse_format(const std::string& format_str,
This duplicates code in stream_load.cpp; we should refactor it. BTW, please add a unit test.
done. PR: #22304
30: optional string delete_condition // delete
31: optional string hidden_columns
32: optional bool trim_double_quotes // trim double quotes for csv
33: optional i32 skip_lines // csv skip line num, only used when csv header_type is not set.
We should group these options into categories, such as scanner_options. @TangSiyang2001, what is your opinion?
params.setHiddenColumns(paramMap.get("hidden_columns"));
params.setTrimDoubleQuotes(Boolean.valueOf(paramMap.getOrDefault("trim_double_quotes", "false")));
params.setSkipLines(Integer.valueOf(paramMap.getOrDefault("skip_lines", "0")));
params.setPartialColumns(Boolean.valueOf(paramMap.getOrDefault("partial_columns", "false")));
We should write this code more cleanly.
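One possible way to tidy up the repeated `getOrDefault` plus `valueOf` pattern is a small typed reader around the parameter map. This is only a sketch: `LoadParamReader` and its method names are hypothetical and do not exist in the Doris code base, and trimming the raw value and wrapping the `NumberFormatException` are assumptions about what the desired behavior would be.

```java
import java.util.Map;

// Hypothetical helper (not part of Doris): typed access to string-valued load parameters.
final class LoadParamReader {
    private final Map<String, String> params;

    LoadParamReader(Map<String, String> params) {
        this.params = params;
    }

    // Returns the raw value, or null when the key is absent.
    String string(String key) {
        return params.get(key);
    }

    // Parses a boolean; any value other than "true" (case-insensitive) yields false.
    boolean bool(String key, boolean defaultValue) {
        String raw = params.get(key);
        return raw == null ? defaultValue : Boolean.parseBoolean(raw.trim());
    }

    // Parses an int and reports the offending key if the value is malformed.
    int integer(String key, int defaultValue) {
        String raw = params.get(key);
        if (raw == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid integer for '" + key + "': " + raw, e);
        }
    }

    public static void main(String[] args) {
        LoadParamReader reader = new LoadParamReader(
                Map.of("skip_lines", "2", "trim_double_quotes", "true"));
        System.out.println(reader.integer("skip_lines", 0));
        System.out.println(reader.bool("trim_double_quotes", false));
        System.out.println(reader.bool("partial_columns", false));
    }
}
```

With such a reader, each setter call would shrink to something like `params.setSkipLines(reader.integer("skip_lines", 0))`, and the defaults stay typed instead of being passed around as strings.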
8: optional string partitions
9: optional string temporary_partitions
10: optional string columns
11: required string format
Do not use required here; declare the field as optional.
This PR aims to implement a new HTTP load, similar to #21172.
With this HTTP load, the load parameters are encapsulated in a SQL statement, which makes it more convenient to use.
User guide
curl -v --location-trusted -u user1:password -H "sql: 'sql string'" -T example.csv http://127.0.0.1:8030/api/v2/_load
sql string
INSERT INTO db1.table1 select * from http(format="csv",xxxx,xxxx) where t1 > 10;
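For clients that do not shell out to curl, the same request can be roughly sketched with the JDK's `java.net.http` API. This is a minimal sketch under several assumptions: `HttpLoadRequestSketch` and `buildLoadRequest` are hypothetical names, the SQL is passed in the `sql` header without the surrounding single quotes shown in the curl command, and the redirect handling that `--location-trusted` provides is left out. The request is only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical client-side sketch: builds a request equivalent to the curl command above.
final class HttpLoadRequestSketch {

    // curl -T uploads with PUT; -u user:password becomes a Basic Authorization header.
    static HttpRequest buildLoadRequest(String url, String user, String password,
                                        String sql, byte[] data) {
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Basic " + credentials)
                .header("sql", sql) // the SQL statement travels in the "sql" header
                .PUT(HttpRequest.BodyPublishers.ofByteArray(data))
                .build();
    }

    public static void main(String[] args) {
        // The xxxx placeholders are kept exactly as they appear in the user guide.
        String sql = "INSERT INTO db1.table1 select * from http(format=\"csv\",xxxx,xxxx) where t1 > 10;";
        byte[] csv = "1,2,3\n".getBytes(StandardCharsets.UTF_8); // stand-in for example.csv
        HttpRequest request = buildLoadRequest(
                "http://127.0.0.1:8030/api/v2/_load", "user1", "password", sql, csv);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request would additionally need an `HttpClient` and a decision on how to follow the redirect while keeping the credentials, which this sketch does not attempt.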
example: