-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[Enhancement](sql-dialect) Support multiple sql-converter service urls #52636
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
run buildall |
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
TPC-H: Total hot run time: 33685 ms |
TPC-DS: Total hot run time: 185849 ms |
ClickBench: Total hot run time: 29.66 s |
FE UT Coverage ReportIncrement line coverage |
239ef24 to
008b2fa
Compare
|
run buildall |
FE UT Coverage ReportIncrement line coverage |
| private static final ConcurrentHashMap<String, UrlManager> urlManagerCache = new ConcurrentHashMap<>(); | ||
|
|
||
| // Blacklist recovery time (ms): 5 minutes | ||
| private static final long BLACKLIST_RECOVERY_TIME_MS = 5 * 60 * 1000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too long, 1min is enough, because the sql convertor should be restart very soon
|
|
||
| public static String convertSql(String targetURLs, String originStmt, String dialect, | ||
| String[] features, String config) { | ||
| if (targetURLs == null || targetURLs.trim().isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be done when setting the variable. See checker() in public @interface VarAttr
| BlacklistEntry existingEntry = blacklist.get(url); | ||
| if (existingEntry != null) { | ||
| // If URL is already in blacklist, limit maximum recovery time to avoid infinite extension | ||
| // Maximum recovery time is 2 times the original recovery time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need, that is too complicated. I think just return if the url is already in black list.
| } | ||
|
|
||
| // Randomly shuffle the order to avoid always trying from the first URL | ||
| Collections.shuffle(healthy, ThreadLocalRandom.current()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not shuffle. Because actually we always want to select the local one (with 127.0.0.1).
So the logic should be: Always try to select url with 127.0.0.1. If it is not available, select another one randomly.
| List<String> prioritizedUrls = Lists.newArrayList(); | ||
|
|
||
| // First: Add all healthy URLs | ||
| List<String> healthyUrls = getHealthyUrls(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You traverse the parsedUrls twice, both in getHealthyUrls() and in getBlacklistedUrls(), which is low efficiency. I think we can only traverse the list once.
- first, add
127.0.01if available. - add other available url.
- add url in black list
008b2fa to
f384cd7
Compare
| for (String url : allUrls) { | ||
| try { | ||
| String result = doConvertSql(url, requestStr); | ||
| if (result != null && !result.equals(originStmt)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why need this condition? !result.equals(originStmt)
| } | ||
| } | ||
|
|
||
| LOG.warn("All URLs failed to convert SQL, return original SQL"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this log, no useful info
|
|
||
| public static String convertSql(String targetURL, String originStmt, String dialect, | ||
| // Cache URL manager instances to avoid duplicate parsing | ||
| private static final ConcurrentHashMap<String, UrlManager> urlManagerCache = new ConcurrentHashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to consider how to remove the unused entries.
So I suggest to use a Caffein Cache, and set a expire time so that it can remote unused entries automatically.
f384cd7 to
9c86e78
Compare
|
run buildall |
TPC-H: Total hot run time: 33339 ms |
TPC-DS: Total hot run time: 188004 ms |
ClickBench: Total hot run time: 29.77 s |
FE UT Coverage ReportIncrement line coverage |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
kaka11chen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
#52636) ### What problem does this PR solve? Problem Summary: **Set up two service addresses** ``` Doris > set global sql_converter_service_url = 'http://127.0.0.1:5001/api/v1/convert,http://127.0.0.1:5002/api/v1/convert'; Query OK, 0 rows affected (0.04 sec) Doris > set sql_dialect = 'clickhouse'; Query OK, 0 rows affected (0.05 sec) Doris > SELECT arrayElement(arr, 2) FROM (SELECT [1, 2, 3] AS arr); +--------------------+ | ELEMENT_AT(arr, 2) | +--------------------+ | 2 | +--------------------+ 1 row in set (0.11 sec) ``` **Stop one of the services** ``` Doris > SELECT arrayElement(arr, 2) FROM (SELECT [1, 2, 3] AS arr); +--------------------+ | ELEMENT_AT(arr, 2) | +--------------------+ | 2 | +--------------------+ 1 row in set (0.14 sec) ``` **Stopping all services resulted in expected errors** ``` Doris > SELECT arrayElement(arr, 2) FROM (SELECT [1, 2, 3] AS arr); ERROR 1105 (HY000): errCode = 2, detailMessage = Every derived table must have its own alias(line 1, pos 33) == SQL == SELECT arrayElement(arr, 2) FROM (SELECT [1, 2, 3] AS arr) ---------------------------------^^^ Doris > SELECT arrayElement(arr, 2) FROM (SELECT [1, 2, 3] AS arr); ERROR 1105 (HY000): errCode = 2, detailMessage = Every derived table must have its own alias(line 1, pos 33) == SQL == SELECT arrayElement(arr, 2) FROM (SELECT [1, 2, 3] AS arr) ---------------------------------^^^ ``` **Restart one of the services** ``` Doris > SELECT arrayElement(arr, 2) FROM (SELECT [1, 2, 3] AS arr); +--------------------+ | ELEMENT_AT(arr, 2) | +--------------------+ | 2 | +--------------------+ 1 row in set (0.16 sec) ```
#52636) ### What problem does this PR solve? Problem Summary: **Set up two service addresses** ``` Doris > set global sql_converter_service_url = 'http://127.0.0.1:5001/api/v1/convert,http://127.0.0.1:5002/api/v1/convert'; Query OK, 0 rows affected (0.04 sec) Doris > set sql_dialect = 'clickhouse'; Query OK, 0 rows affected (0.05 sec) Doris > SELECT arrayElement(arr, 2) FROM (SELECT [1, 2, 3] AS arr); +--------------------+ | ELEMENT_AT(arr, 2) | +--------------------+ | 2 | +--------------------+ 1 row in set (0.11 sec) ``` **Stop one of the services** ``` Doris > SELECT arrayElement(arr, 2) FROM (SELECT [1, 2, 3] AS arr); +--------------------+ | ELEMENT_AT(arr, 2) | +--------------------+ | 2 | +--------------------+ 1 row in set (0.14 sec) ``` **Stopping all services resulted in expected errors** ``` Doris > SELECT arrayElement(arr, 2) FROM (SELECT [1, 2, 3] AS arr); ERROR 1105 (HY000): errCode = 2, detailMessage = Every derived table must have its own alias(line 1, pos 33) == SQL == SELECT arrayElement(arr, 2) FROM (SELECT [1, 2, 3] AS arr) ---------------------------------^^^ Doris > SELECT arrayElement(arr, 2) FROM (SELECT [1, 2, 3] AS arr); ERROR 1105 (HY000): errCode = 2, detailMessage = Every derived table must have its own alias(line 1, pos 33) == SQL == SELECT arrayElement(arr, 2) FROM (SELECT [1, 2, 3] AS arr) ---------------------------------^^^ ``` **Restart one of the services** ``` Doris > SELECT arrayElement(arr, 2) FROM (SELECT [1, 2, 3] AS arr); +--------------------+ | ELEMENT_AT(arr, 2) | +--------------------+ | 2 | +--------------------+ 1 row in set (0.16 sec) ```
…vice urls (apache#52636)" This reverts commit 8c3c54e.
…vice urls (apache#52636)" This reverts commit 8c3c54e.
…vice urls (apache#52636)" (apache#59610) This may cause sql convertor read timeout
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Set up two service addresses
Stop one of the services
Stopping all services resulted in expected errors
Restart one of the services
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)