-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue #929: Perform async DNS resolution each time attempting connect… #930
Conversation
…onnection to service Url Also attempt to connecto to all IP addresses present in DNS record
LGTM |
* working | ||
*/ | ||
private CompletableFuture<Channel> connectToResolvedAddresses(Iterator<InetAddress> unresolvedAddresses, int port) { | ||
CompletableFuture<Channel> future = new CompletableFuture<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall. 👍
If the good address is last in the iterator, instead of iterating to resolve every time, should we add some sort of affinity to reverse order? That'll complicate the logic though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing is that DNS servers typically do that rotation of IPs already, for load balancing.
In any case, right now we're just trying to 1st IP :) (and then we stick to it...)
@rdhabalia Please take a look. Also, since the 1.21 RC is still on hold, should we consider fix this in 1.21? |
If it's not urgent then can we take in next one else I will finish review today. |
It's a bit broken when doing DNS based service discovery... :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGM.. just minor comments.
@@ -101,6 +111,9 @@ public void initChannel(SocketChannel ch) throws Exception { | |||
ch.pipeline().addLast("handler", new ClientCnx(client)); | |||
} | |||
}); | |||
|
|||
this.dnsResolver = new DnsNameResolverBuilder(eventLoopGroup.next()) | |||
.channelType(EventLoopUtil.getDatagramChannelClass(eventLoopGroup)).build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add queryTimeoutMillis(2000)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and also traceEnabled(true)
that can help for any resolution failure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's already a default timeout of 5secs. We can tune it down if desired
CompletableFuture<List<InetAddress>> future = new CompletableFuture<>(); | ||
dnsResolver.resolveAll(hostname).addListener((Future<List<InetAddress>> resolveFuture) -> { | ||
if (resolveFuture.isSuccess()) { | ||
future.complete(resolveFuture.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible that resolveFuture.get()
returns null incase it doesn't find or it will not return successful future when it doesn't find?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that if the dns operation completes successfully, it should always carry at least 1 IP address
@@ -195,14 +203,91 @@ public void initChannel(SocketChannel ch) throws Exception { | |||
cnx.ctx().close(); | |||
return null; | |||
}); | |||
}).exceptionally(exception -> { | |||
log.warn("Failed to open connection to {} : {}", physicalAddress, exception.getClass().getSimpleName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any specific reason to log exception.getClass().getSimpleName()
instead message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had that already in the existing code. It was to just use "ConnectionError" type names I think
It would be better if we can add at least one test case.? Does it make sense to create unit test for |
Good point for mocking resolve name, will add that |
@rdhabalia Added tests with mocked DNS resolution |
ping @rdhabalia |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 LGTM.. just a minor question regarding:
adding traceEnabled(true)
at ConnectionPool:116 will be helpful in case of debugging?
…ion to service Url
Also attempt to connect to all IP addresses present in DNS record
Context in #929
I'm not sure yet how to properly unit test this change, suggestions welcome.