SOLR-17419: Introduce ParallelHttpShardHandler #2681
Merged: gerlowskija merged 10 commits into apache:main from gerlowskija:SOLR-17149-parallel-shard-handling on Sep 6, 2024
The default ShardHandler implementation, HttpShardHandler, sends all shard-requests serially, only parallelizing the waiting and parsing of responses. This works great for collections with few shards, but as the number of shards increases, the serialized sending of shard-requests adds a larger and larger overhead to the overall request (especially when auth and PKI are done at request-sending time).

This commit fixes this by introducing an alternate ShardHandler implementation, geared towards collections with many shards. This ShardHandler uses an executor to parallelize both request sending and response waiting/parsing. This consumes more CPU, but greatly reduces the latency/QTime observed by users querying many-shard collections.

Remaining TODOs:
- tests for ParallelHttpShardHandler
- precommit/check
- Javadocs
- ref-guide docs for shard handler abstraction
- test randomization for http vs parallel SH
gerlowskija changed the title from "SOLR-17149: Introduce ParallelHttpShardHandler" to "SOLR-17419: Introduce ParallelHttpShardHandler" on Aug 30, 2024
github-actions bot added the documentation label (Improvements or additions to documentation) on Aug 30, 2024
gerlowskija added a commit that referenced this pull request on Sep 6, 2024
The default ShardHandler implementation, HttpShardHandler, sends all shard-requests serially, only parallelizing the waiting and parsing of responses. This works great for collections with few shards, but as the number of shards increases, the serialized sending of shard-requests adds a larger and larger overhead to the overall request (especially when auth and PKI are done at request-sending time).

This commit fixes this by introducing an alternate ShardHandler implementation, geared towards collections with many shards. This ShardHandler uses an executor to parallelize both request sending and response waiting/parsing. This consumes more CPU, but greatly reduces the latency/QTime observed by users querying many-shard collections.
@gerlowskija I think the Jira item is wrong (SOLR-17419 vs SOLR-17149). [EDIT] Oh, actually the CHANGE file is correct. I was confused because this PR got linked to another Jira. Sorry for the noise.
https://issues.apache.org/jira/browse/SOLR-17419
Description
The default ShardHandler implementation, HttpShardHandler, sends all shard-requests serially, only parallelizing the waiting and parsing of responses. This works great for collections with few shards, but as the number of shards increases, the serialized sending of shard-requests adds a larger and larger overhead. This is especially stark when auth is enabled and PKI header-generation happens at request-sending time.
Solution
This commit fixes this by introducing an alternate ShardHandler implementation, geared towards collections with many shards. This ShardHandler uses an executor to parallelize both request sending and response waiting/parsing. This consumes more CPU, but greatly reduces the latency/QTime observed by users querying many-shard collections.
See the perf-test results shared on SOLR-17149 for more details!
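To illustrate the difference described above, here is a minimal, self-contained sketch that contrasts sending shard-requests one after another on the calling thread with submitting each one to an executor. It is not the code in this PR and uses no Solr APIs: the class `ShardSendSketch`, the method names, and the simulated 20 ms send-time cost (a stand-in for work like auth/PKI header generation) are all assumptions made for illustration. It also simplifies by modeling only the send-time cost, not the response wait.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ShardSendSketch {

  // Hypothetical stand-in for per-request work done at send time
  // (e.g. auth / PKI header generation). Not a Solr API.
  static String prepareAndSend(String shard) {
    try {
      Thread.sleep(20); // simulated fixed per-request send cost (assumption)
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return "response-from-" + shard;
  }

  // Serial sending: the calling thread pays the send cost once per shard,
  // so total send overhead grows linearly with the number of shards.
  static List<String> sendSerially(List<String> shards) {
    List<String> responses = new ArrayList<>();
    for (String shard : shards) {
      responses.add(prepareAndSend(shard));
    }
    return responses;
  }

  // Parallel sending: each shard-request is handed to the executor, so the
  // send cost is spread across worker threads at the price of more CPU.
  static List<String> sendInParallel(List<String> shards, ExecutorService executor) {
    List<CompletableFuture<String>> futures = new ArrayList<>();
    for (String shard : shards) {
      futures.add(CompletableFuture.supplyAsync(() -> prepareAndSend(shard), executor));
    }
    // Collect results in the original shard order.
    List<String> responses = new ArrayList<>();
    for (CompletableFuture<String> future : futures) {
      responses.add(future.join());
    }
    return responses;
  }

  public static void main(String[] args) {
    List<String> shards = new ArrayList<>();
    for (int i = 1; i <= 16; i++) {
      shards.add("shard" + i);
    }
    ExecutorService executor = Executors.newFixedThreadPool(8);
    try {
      long start = System.nanoTime();
      sendSerially(shards);
      long serialMs = (System.nanoTime() - start) / 1_000_000;

      start = System.nanoTime();
      sendInParallel(shards, executor);
      long parallelMs = (System.nanoTime() - start) / 1_000_000;

      System.out.println("serial send:   " + serialMs + " ms");
      System.out.println("parallel send: " + parallelMs + " ms");
    } finally {
      executor.shutdown();
    }
  }
}
```

The timings printed by `main` depend on the machine and pool size, so treat them only as a rough indication of how serialized send-time work scales with shard count.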
Tests
Consolidates two related test files into the single TestShardHandlerFactory, and adds randomization to cover the new ParallelSHF implementation.