-
Notifications
You must be signed in to change notification settings - Fork 662
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SOLR-16927 Allow SolrClientCache clients to use Jetty HTTP2 clients #1835
SOLR-16927 Allow SolrClientCache clients to use Jetty HTTP2 clients #1835
Conversation
looking for a review on this. @dsmiley, @epugh @joel-bernstein I hope you don't mind I added you. looking forward to your thoughts! |
@@ -55,6 +55,7 @@ public class CommitStream extends TupleStream implements Expressible { | |||
private TupleStream tupleSource; | |||
|
|||
private transient SolrClientCache clientCache; | |||
private transient boolean isCloseCache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this new boolean isCloseCache
? Can't we just check if clientCache == null
in the close method and set it to clientCache = null
after clientCache.close();
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right. there are 2 patterns of doing this in the PR (flag vs null check). I will switch to this one, it's cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry @risdenk I misread your question. in this case no, you can't get rid of the flag because if the cache is not null you don't know if it came from the context (via setStreamContext method) or it was created locally (in the open method). if it came from the context it's reused for multiple streams so you can't close it here, you have to close it at the parent location where it was created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name "isCloseCache" is clumsy IMO; I think "doCloseCache" is far clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this!
solr/solrj-streaming/src/java/org/apache/solr/client/solrj/io/SolrClientCache.java
Outdated
Show resolved
Hide resolved
public class SolrClientCache implements Serializable { | ||
public class SolrClientCache implements Serializable, Closeable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a subclass approach would work better, such that LegacySolrClientCache (proposed name) would override production of a new SolrClient? After all, we have multiple types of SolrClient implementations without having a single one trying to work with both HttpClient types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we are touching this, do we really need to implement Serializable?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would removing it and seeing if any test fails be sufficient as a check if we can do this or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a subclass approach would work better
I am not convinced. the complexity comes from the streaming code that has a lot of ins and outs. ideally it would work both types of clients irrespective of where they come from (internal node communication or external apps). I don't think the code would be as clean, but I can definitely give it a go and see what fits better.
solr/solrj-streaming/src/java/org/apache/solr/client/solrj/io/SolrClientCache.java
Outdated
Show resolved
Hide resolved
.withSocketTimeout(30000, TimeUnit.MILLISECONDS) | ||
.withConnectionTimeout(15000, TimeUnit.MILLISECONDS) | ||
.build(); | ||
cloudSolrClient = SolrClientCache.newCloudHttp2SolrClient(zkHost, null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is curious... it seems you have made a public static method on SolrClientCache for creating a CloudHttp2SolrClient, thus making this class a builder for anyone anywhere who wants such a client? Shouldn't we let the builder of CloudSolrClient do such a job?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I agree with your concern. but I added it to replace all manual builder use from across the streaming module. there is no way to define visibility to module only, so this is what I ended up with. I prefer this single point of entry because it handles the timeout setting correctly, rather than repeating this code all over. the inconsistency was that some of the builders have increased timeouts, some don't, it's hard to understand what goes where in the streaming module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thinking about it a bit more, we can transition all streaming code to use 'client cache' instances instead of 'http clients' directly and this would allow all the builder code to be in one place with no public static methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think that's best.
And if there is timeout initialization handling in SolrClientCache, I wonder why it's there; why isn't it in the builder of CloudSolrClient etc.? SCC should just be a cache IMO. Nothing much to it. CC @joel-bernstein
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the way I read it, this class attempts to set a min timeout(~60s) irrespective of what was set on client originally ... and also in passing manages to reset the timeouts back to 60s if they are bigger (which is an unfortunate side effect I wanted to address with this change) - I'm thinking the default httpclient on the UpdateShardHandler might play a central role here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dsmiley did a complete pass over all classes and moved everything to cache. please take a look.
solr/solrj/src/java/org/apache/solr/client/solrj/impl/Http2SolrClient.java
Show resolved
Hide resolved
@joel-bernstein do you think you will have some time to review this PR? |
4d5a9b8
to
32847e8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than a little rename, looks good.
Please don't force-push!
My concerns about the necessity of the timeouts can be deferred -- "progress not perfection".
@@ -55,6 +55,7 @@ public class CommitStream extends TupleStream implements Expressible { | |||
private TupleStream tupleSource; | |||
|
|||
private transient SolrClientCache clientCache; | |||
private transient boolean isCloseCache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name "isCloseCache" is clumsy IMO; I think "doCloseCache" is far clearer.
sounds good. will rename and merge in the next few days. thanks @dsmiley! |
405c586
to
db15bc0
Compare
https://issues.apache.org/jira/browse/SOLR-16927
Description
Allow SolrClientCache to work with Jetty http2 clients.
Solution
This change is allowing the SolrClientCache users to use a Jetty http2 client, in parallel with the internal Solr http client.
All http client creation (both old and new) is moved into the SolrClientCache, this will make it a single source for all the streaming code. It will also fallback to Jetty http2 by default.
I also poked around the timeouts a bit, the current values based on system properties will overwrite the internal http client ones, which is bad when the client values are larger than the local ones.
I had to touch some of the streaming classes, as the http2 client needs to be properly closed, which was not the case everywhere.
I added a benchmark, but I still need to tweak it a bit before publishing numbers.
Just for future reference of what remains to be done in this class, most of the internal use of this cache is based on the UpdateShardHandler.getDefaultHttpClient (SOLR-16503) - the CoreContainer reference, so the complete refactoring is blocked until that transition completes.
Tests
Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem.
Checklist
Please review the following and check all that apply:
main
branch../gradlew check
.