SOLR-16927 Allow SolrClientCache clients to use Jetty HTTP2 clients #1835

Merged

Member

@stillalex stillalex commented Aug 10, 2023

https://issues.apache.org/jira/browse/SOLR-16927

Description

Allow SolrClientCache to work with Jetty http2 clients.

Solution

This change allows SolrClientCache users to use a Jetty http2 client, in parallel with the internal Solr http client.

All http client creation (both old and new) is moved into the SolrClientCache, which makes it the single source of clients for all the streaming code. It will also fall back to Jetty http2 by default.
I also poked around the timeouts a bit: the current values based on system properties overwrite the internal http client ones, which is bad when the client values are larger than the local ones.

I had to touch some of the streaming classes, as the http2 client needs to be properly closed, which was not the case everywhere.

I added a benchmark, but I still need to tweak it a bit before publishing numbers.

Just for future reference of what remains to be done in this class, most of the internal use of this cache is based on the UpdateShardHandler.getDefaultHttpClient (SOLR-16503) - the CoreContainer reference, so the complete refactoring is blocked until that transition completes.

Tests

Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

@stillalex
Member Author

looking for a review on this. @dsmiley, @epugh @joel-bernstein I hope you don't mind I added you. looking forward to your thoughts!

@@ -55,6 +55,7 @@ public class CommitStream extends TupleStream implements Expressible {
private TupleStream tupleSource;

private transient SolrClientCache clientCache;
private transient boolean isCloseCache;
Contributor

Do we need this new boolean isCloseCache? Can't we just check if clientCache == null in the close method and set it to clientCache = null after clientCache.close();?

Member Author

you are right. there are 2 patterns of doing this in the PR (flag vs null check). I will switch to this one, it's cleaner.

Member Author

@stillalex stillalex Aug 11, 2023

sorry @risdenk I misread your question. in this case no, you can't get rid of the flag because if the cache is not null you don't know if it came from the context (via setStreamContext method) or it was created locally (in the open method). if it came from the context it's reused for multiple streams so you can't close it here, you have to close it at the parent location where it was created.
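
For readers following the thread, the ownership rule described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `DemoClientCache`, `DemoStream`, `setSharedCache` and the `ownsCache` field are hypothetical stand-ins for SolrClientCache, a stream class, setStreamContext and the flag under discussion.

```java
import java.io.Closeable;

// Hypothetical stand-in for a shared client cache; not Solr's SolrClientCache.
class DemoClientCache implements Closeable {
  private boolean closed;

  @Override
  public void close() {
    closed = true;
  }

  boolean isClosed() {
    return closed;
  }
}

// Hypothetical stream that either borrows a cache from its context or creates its own.
class DemoStream implements Closeable {
  private DemoClientCache clientCache;
  private boolean ownsCache; // true only when this stream created the cache itself

  // Plays the role of setStreamContext: the cache is shared, so this stream must not close it.
  void setSharedCache(DemoClientCache shared) {
    this.clientCache = shared;
    this.ownsCache = false;
  }

  // Plays the role of open(): create a private cache only when none was supplied.
  void open() {
    if (clientCache == null) {
      clientCache = new DemoClientCache();
      ownsCache = true;
    }
  }

  DemoClientCache cache() {
    return clientCache;
  }

  @Override
  public void close() {
    // A null check alone cannot tell a borrowed cache from an owned one, hence the flag.
    if (ownsCache && clientCache != null) {
      clientCache.close();
    }
    clientCache = null;
    ownsCache = false;
  }
}
```

With this shape, closing a stream that borrowed its cache leaves the cache open for the other streams sharing it, while a stream that created its own cache releases it.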

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name "isCloseCache" is clumsy IMO; I think "doCloseCache" is far clearer.

Contributor

@dsmiley dsmiley left a comment

Thanks for working on this!

Comment on lines 43 to 46
- public class SolrClientCache implements Serializable {
+ public class SolrClientCache implements Serializable, Closeable {
Contributor

Maybe a subclass approach would work better, such that LegacySolrClientCache (proposed name) would override production of a new SolrClient? After all, we have multiple types of SolrClient implementations without having a single one trying to work with both HttpClient types.

Contributor

While we are touching this, do we really need to implement Serializable?!

Member Author

would removing it and seeing if any test fails be sufficient as a check if we can do this or not?

Member Author

> Maybe a subclass approach would work better

I am not convinced. the complexity comes from the streaming code that has a lot of ins and outs. ideally it would work with both types of clients irrespective of where they come from (internal node communication or external apps). I don't think the code would be as clean, but I can definitely give it a go and see what fits better.

.withSocketTimeout(30000, TimeUnit.MILLISECONDS)
.withConnectionTimeout(15000, TimeUnit.MILLISECONDS)
.build();
cloudSolrClient = SolrClientCache.newCloudHttp2SolrClient(zkHost, null);
Contributor

This line is curious... it seems you have made a public static method on SolrClientCache for creating a CloudHttp2SolrClient, thus making this class a builder for anyone anywhere who wants such a client? Shouldn't we let the builder of CloudSolrClient do such a job?

Member Author

yes I agree with your concern. but I added it to replace all manual builder use from across the streaming module. there is no way to define visibility to module only, so this is what I ended up with. I prefer this single point of entry because it handles the timeout setting correctly, rather than repeating this code all over. the inconsistency was that some of the builders have increased timeouts, some don't, it's hard to understand what goes where in the streaming module.

Member Author

thinking about it a bit more, we can transition all streaming code to use 'client cache' instances instead of 'http clients' directly and this would allow all the builder code to be in one place with no public static methods.
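
As a rough illustration of that idea (an editorial sketch, not the code that was merged), callers could ask a cache instance for a client and never see the builder. `SketchClient`, `SketchClientCache`, `getClient` and `newClient` are hypothetical names, and the real construction logic is replaced by a plain constructor call.

```java
import java.io.Closeable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical client type; stands in for whatever SolrClient the cache would hand out.
class SketchClient implements Closeable {
  final String zkHost;
  private boolean closed;

  SketchClient(String zkHost) {
    this.zkHost = zkHost;
  }

  @Override
  public void close() {
    closed = true;
  }

  boolean isClosed() {
    return closed;
  }
}

// Hypothetical cache: the only place where clients are built, with no public static factory.
class SketchClientCache implements Closeable {
  private final Map<String, SketchClient> clients = new ConcurrentHashMap<>();

  // Returns the cached client for this ZooKeeper host, building it on first use.
  SketchClient getClient(String zkHost) {
    return clients.computeIfAbsent(zkHost, this::newClient);
  }

  // Builder code (timeouts and so on) would live here, hidden from callers.
  private SketchClient newClient(String zkHost) {
    return new SketchClient(zkHost);
  }

  @Override
  public void close() {
    clients.values().forEach(SketchClient::close);
    clients.clear();
  }
}
```

Keeping construction private means the timeout handling is written once, which is the consistency concern raised earlier in this thread.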

Contributor

Yeah I think that's best.
And if there is timeout initialization handling in SolrClientCache, I wonder why it's there; why isn't it in the builder of CloudSolrClient etc.? SCC should just be a cache IMO. Nothing much to it. CC @joel-bernstein

Member Author

the way I read it, this class attempts to set a min timeout(~60s) irrespective of what was set on client originally ... and also in passing manages to reset the timeouts back to 60s if they are bigger (which is an unfortunate side effect I wanted to address with this change) - I'm thinking the default httpclient on the UpdateShardHandler might play a central role here.
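
To make the difference concrete, here is a minimal sketch of the two behaviors described above. It is not taken from SolrClientCache: the class and method names are hypothetical, and the 60s floor is simply the approximate value mentioned in the comment.

```java
// Hypothetical helper; names are illustrative and the 60s floor comes from the comment above.
final class TimeoutFloor {
  static final int MIN_TIMEOUT_MS = 60_000;

  private TimeoutFloor() {}

  // The side effect described above: the configured value is always replaced,
  // so a larger client timeout gets reset back down to the floor.
  static int overwrite(int configuredMs) {
    return MIN_TIMEOUT_MS;
  }

  // The intended behavior: raise small values to the floor, never lower larger ones.
  static int atLeast(int configuredMs) {
    return Math.max(configuredMs, MIN_TIMEOUT_MS);
  }
}
```

Under these assumptions, a client configured with a 120s timeout keeps it with `atLeast` but drops to 60s with `overwrite`.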

Member Author

@dsmiley did a complete pass over all classes and moved everything to cache. please take a look.

@stillalex stillalex marked this pull request as ready for review August 12, 2023 18:18
@stillalex
Member Author

@joel-bernstein do you think you will have some time to review this PR?

@stillalex stillalex force-pushed the SOLR-16927-solrclientcache-clients-http2 branch from 4d5a9b8 to 32847e8 Compare August 16, 2023 21:05
Contributor

@dsmiley dsmiley left a comment

Other than a little rename, looks good.
Please don't force-push!
My concerns about the necessity of the timeouts can be deferred -- "progress not perfection".

@stillalex
Member Author

sounds good. will rename and merge in the next few days. thanks @dsmiley!

@stillalex stillalex force-pushed the SOLR-16927-solrclientcache-clients-http2 branch from 405c586 to db15bc0 Compare August 21, 2023 12:23
@stillalex stillalex merged commit f6ef54a into apache:main Aug 21, 2023
3 checks passed
@stillalex stillalex deleted the SOLR-16927-solrclientcache-clients-http2 branch August 21, 2023 12:49
stillalex added a commit that referenced this pull request Aug 21, 2023
3 participants