Skip to content

Propagate socketProvider.get() exception to a connection state #30

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Joyfolk opened this issue Feb 27, 2018 · 7 comments · Fixed by #221
Closed

Propagate socketProvider.get() exception to a connection state #30

Joyfolk opened this issue Feb 27, 2018 · 7 comments · Fixed by #221
Assignees

Comments

@Joyfolk
Copy link

Joyfolk commented Feb 27, 2018

channel = socketProvider.get(retry++, lastError == NOT_INIT_EXCEPTION ? null : lastError);

It's better to move this line inside try/catch block.

@Totktonada
Copy link
Member

Now we stick with this behaviour as part of the SocketChannelProvider contract. From README.md:

The TarantoolClient will stop functioning if your implementation of a socket channel provider raises an exception or returns a null. You will need a new instance of client to recover. Hence, you should only throw in case you have met unrecoverable error.

Don't sure whether this behaviour was intentional when it was created, but it has sense for me: only logic inside SocketChannelProvider.get() can distinguish recoverable errors from fatal ones. TarantoolClientImpl can not do it and would retry attempts to get a socket is case of any error even if it is something unexpected.

I don't think we should change this contract. A user can wrap a get code with a try-catch block and perform retry in case, say, IOException, but other exceptions will lead to explicit fail (maybe we also need to provide this exception as a CommonucationException cause).

@Joyfolk Can you share your opinion on that?

@Joyfolk
Copy link
Author

Joyfolk commented Mar 14, 2019

@Totktonada It is a good point about the contract. But any uncatched exception wouldn't lead to explicit fail of the application. It leads only to termination of the reconnection thread without any notification to the rest of the application and to hard-to-debug errors in the client code. I think it would be better to set connection state as dead with correct error information.

@Totktonada Totktonada changed the title Reconnect thread would stop if socketProvider.get throws an exception Propagate socketProvider.get() exception to a connection state Mar 15, 2019
@Totktonada
Copy link
Member

Agreed. I rephrased the issue header.

@nicktorwald
Copy link

I think it would be better to set connection state as dead with correct error information.

@Joyfolk Please take a look at this change of the client.
Has it already done in this commit?

@Totktonada
Copy link
Member

Related: PR #170.

@Totktonada
Copy link
Member

When a client is unable to connect during its creation, we have no object to call getThumbstone(). The exception that is raised in the case is about timeout. We should add a real cause to it: one that we got from a socket channel provider. I think the behaviour should be more or less the following:

diff --git a/src/test/java/org/tarantool/ClientReconnectIT.java b/src/test/java/org/tarantool/ClientReconnectIT.java
index f644fc8..5cb7e53 100644
--- a/src/test/java/org/tarantool/ClientReconnectIT.java
+++ b/src/test/java/org/tarantool/ClientReconnectIT.java
@@ -16,6 +16,7 @@ import org.junit.jupiter.api.BeforeEach;
 import org.junit.jupiter.api.Test;
 import org.junit.jupiter.api.function.Executable;
 
+import java.net.ConnectException;
 import java.nio.channels.SocketChannel;
 import java.time.Duration;
 import java.util.Collections;
@@ -422,4 +423,23 @@ public class ClientReconnectIT {
         return new TarantoolClientImpl(provider, makeDefaultClientConfig());
     }
 
+    @Test
+    public void testConnectionRefusedException() {
+        testHelper.stopInstance();
+
+        TarantoolClientConfig config = makeDefaultClientConfig();
+        config.initTimeoutMillis = 100;
+        CommunicationException e = assertThrows(CommunicationException.class,
+            () -> new TarantoolClientImpl(socketChannelProvider, config));
+
+        assertEquals(e.getMessage(), "100ms is exceeded when waiting " +
+            "for client initialization. You could configure init timeout " +
+            "in TarantoolConfig");
+        Throwable cause = e.getCause();
+        assertTrue(ConnectException.class.isAssignableFrom(cause.getClass()));
+        assertEquals(cause.getMessage(), "Connection refused");
+
+        testHelper.startInstance();
+    }
+
 }

(Maybe it is better to mock SingleSocketChannelProviderImpl or implement a socket provider that gives an error instead of starting / stopping an instance.)

Our customer recently met a connection problem and it is hard to debug it without a real cause. The only workaround now is to implement own socket channel provider and log a problem from it.

We also need to ensure (write a test) that a real cause is in thombstone when a reconnection fails after successful connect.

@Totktonada
Copy link
Member

I think that the issue can be closed when those things will work as described above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants