-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[Fix](catalog)Resources should be closed when dropping a Catalog. #59512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
|
run buildall |
TPC-H: Total hot run time: 32013 ms |
TPC-DS: Total hot run time: 173785 ms |
ClickBench: Total hot run time: 27.03 s |
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
|
run buildall |
TPC-H: Total hot run time: 31365 ms |
TPC-DS: Total hot run time: 174157 ms |
ClickBench: Total hot run time: 27.04 s |
|
run external |
|
run buildall |
TPC-H: Total hot run time: 31913 ms |
TPC-DS: Total hot run time: 171533 ms |
ClickBench: Total hot run time: 26.83 s |
FE Regression Coverage ReportIncrement line coverage |
|
run cloud_p0 |
FE Regression Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 31383 ms |
TPC-DS: Total hot run time: 172886 ms |
ClickBench: Total hot run time: 26.8 s |
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
|
run buildall |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
TPC-H: Total hot run time: 31243 ms |
TPC-DS: Total hot run time: 172757 ms |
ClickBench: Total hot run time: 27.26 s |
FE Regression Coverage ReportIncrement line coverage |
…Catalog. (#59642) … #59512 #59063 ### What problem does this PR solve? Issue Number: close #xxx Related PR: #xxx Problem Summary: ### Release note None ### Check List (For Author) - Test <!-- At least one of them must be included. --> - [ ] Regression test - [ ] Unit Test - [ ] Manual test (add detailed scripts or steps below) - [ ] No need to test or manual test. Explain why: - [ ] This is a refactor/code format and no logic has been changed. - [ ] Previous test can cover this change. - [ ] No code files have been changed. - [ ] Other reason <!-- Add your reason? --> - Behavior changed: - [ ] No. - [ ] Yes. <!-- Explain the behavior change --> - Does this need documentation? - [ ] No. - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 --> ### Check List (For Reviewer who merge this PR) - [ ] Confirm the release note - [ ] Confirm test cases - [ ] Confirm document - [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
…ache#59512) ### What problem does this PR solve? When creating an AWS SDK V2 async client (e.g. S3AsyncClient), the SDK requires a thread pool to manage asynchronous task scheduling, timeouts, and retries (e.g. ScheduledExecutorService or async executor). If no executor is explicitly provided, the AWS SDK creates its own internal thread pool, which is expected to be shut down when client.close() is invoked. ### Issue In Doris, when using Paimon Catalog, some catalog implementations provide an empty close() method. As a result, when a user executes DROP CATALOG: starting with Hadoop 3.4, the AWS SDK was upgraded to v2. Since apache#57226 upgraded HDFS to 3.4.2, the catalog runs into this issue. Some Catalog instance is discarded,AWS SDK client.close() is never called,The internally created thread pool cannot be shut down. This leads to a thread leak FYI aws/aws-sdk-java-v2#3746 ### Problem Analysis The lifecycle of Catalog creation and destruction is complex and managed internally by Paimon Doris cannot reliably intervene in the call chain to enforce AWS SDK client.close() If each Catalog creates its own AWS SDK client with an internally managed thread pool, threads will continue to leak as Catalogs are repeatedly created and dropped ### Solution This PR resolves the issue by introducing a shared executor strategy: A Doris-managed shared thread pool is explicitly passed when creating AWS SDK V2 clients This prevents the AWS SDK from implicitly creating per-client internal thread pools The lifecycle of the executor is fully controlled by Doris and no longer depends on Paimon Catalog’s close() implementation With this approach, even if a Paimon Catalog’s close() method is a no-op, the system will no longer leak threads. None
What problem does this PR solve?
When creating an AWS SDK V2 async client (e.g. S3AsyncClient), the SDK requires a thread pool to manage asynchronous task scheduling, timeouts, and retries (e.g. ScheduledExecutorService or async executor).
If no executor is explicitly provided, the AWS SDK creates its own internal thread pool, which is expected to be shut down when client.close() is invoked.
Issue
In Doris, when using Paimon Catalog, some catalog implementations provide an empty close() method. As a result, when a user executes DROP CATALOG:
starting with Hadoop 3.4, the AWS SDK was upgraded to v2. Since #57226 upgraded HDFS to 3.4.2, the catalog runs into this issue.
Some Catalog instance is discarded,AWS SDK client.close() is never called,The internally created thread pool cannot be shut down.
This leads to a thread leak
FYI aws/aws-sdk-java-v2#3746
Problem Analysis
The lifecycle of Catalog creation and destruction is complex and managed internally by Paimon
Doris cannot reliably intervene in the call chain to enforce AWS SDK client.close()
If each Catalog creates its own AWS SDK client with an internally managed thread pool, threads will continue to leak as Catalogs are repeatedly created and dropped
Solution
This PR resolves the issue by introducing a shared executor strategy:
A Doris-managed shared thread pool is explicitly passed when creating AWS SDK V2 clients
This prevents the AWS SDK from implicitly creating per-client internal thread pools
The lifecycle of the executor is fully controlled by Doris and no longer depends on Paimon Catalog’s close() implementation
With this approach, even if a Paimon Catalog’s close() method is a no-op, the system will no longer leak threads.
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)