Failure creating a subscription on a (fairly) new connection #316
Comments
Here's test code that will reproduce this issue. Maybe I'm doing something wrong... If I comment out the close in the callback, it works. That being said, I don't see how close on one connection should affect another.
Code
public static void main(String[] argv) {
    try {
        // setup connection
        Connection nc = Nats.connect("localhost:4222");
        // setup a message handler that creates another connection
        // and closes the original connection.
        MessageHandler mh = new MessageHandler() {
            public void onMessage(Message msg) throws InterruptedException {
                try {
                    // create a new connection here
                    Connection nc2 = Nats.connect("localhost:4222");
                    // close the original conn
                    try {
                        nc.close();
                    } catch (Exception e) {
                        // Doesn't get hit
                        e.printStackTrace();
                    }
                    // recreate subscription on the new connection
                    Dispatcher d2 = nc2.createDispatcher(this);
                    d2.subscribe("foo"); // <- exception here
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        };
        // setup handler and subscription
        Dispatcher d1 = nc.createDispatcher(mh);
        d1.subscribe("foo");
        nc.flush(Duration.ofSeconds(2));
        // Invoke the callback from a completely different connection
        Nats.connect("localhost:4222").publish("foo", new byte[64]);
        Thread.sleep(2000);
    } catch (Exception e) {
        e.printStackTrace();
    }
    System.out.println("Done");
}
Output
|
If this only happens in the latest version, it could be related to the PR that put a max size on the queue.
On May 28, 2020 12:15 PM, Colin Sullivan wrote:
Here's test code that will reproduce this issue. Maybe I'm doing something wrong... If I comment out the close in the callback, it works. That being said, I don't see how close on one connection should affect another.
|
Looks like it was introduced in 2.6.7...
|
Looks like you and Richard worked on that one, maybe he has some ideas. I will try to look if you don't see anything. |
I think I've found the issue, or at least have a working theory. The initial connection's close from within the callback will interrupt the callback thread itself, so the first "interruptible" API after the close throws an InterruptedException. In the example provided it is queue.offer found in subscribe:
nats.java/src/main/java/io/nats/client/impl/MessageQueue.java Lines 144 to 151 in 507490d
offer was throwing an interrupted exception, so the method returned false. Calling code assumed the queue was full upon a false return code.
In the example code, if you call Thread.sleep(1) immediately after the close, the sleep is interrupted instead.
To resolve, if a connection is closed, should user code in the callback be interrupted, or should the callback be allowed to finish? @RichardHightower, @sasbury, wdyt? |
Hm, is the thread interrupted? If so, I don’t recall ever writing that, interrupting threads is generally bad. I would say that the callback should complete.
On Jun 8, 2020, at 3:38 PM, Colin Sullivan wrote:
I think I've found out the issue, or at least have a working theory. The initial connection's close from within the callback will interrupt the callback thread itself, so the first "interruptable" API after the close throws an InterruptedException. In the example provided it is queue.offer found in subscribe:
https://github.com/nats-io/nats.java/blob/507490d9941a4dc5fe8333ad1931bd14818a66d2/src/main/java/io/nats/client/impl/MessageQueue.java#L144-L151
offer was throwing an interrupted exception, so the method returned false. Calling code assumed the queue was full upon a false return code.
In the example code if called Thread.sleep(1) immediately after the close, the sleep is interrupted instead.
To resolve, if a connection is closed, should user code in the callback be interrupted, or should the callback be allowed to finish?
@RichardHightower, @sasbury, wdyt?
|
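The working theory above can be reproduced with the JDK alone. The following is a minimal sketch, not the nats.java code: the class and method names are hypothetical, and it assumes the pattern described in the thread (a timed `offer` on a `LinkedBlockingQueue` whose `InterruptedException` is caught and turned into `false`). Setting the interrupt flag by hand stands in for the close call interrupting the callback thread.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; none of these names come from nats.java.
final class InterruptedOfferDemo {

    // Assumed pattern: a timed offer whose InterruptedException is swallowed
    // and reported to the caller as "false".
    static boolean offerSwallowingInterrupt(LinkedBlockingQueue<String> queue, String item) {
        try {
            return queue.offer(item, 1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            return false; // the caller cannot tell this apart from "queue is full"
        }
    }

    // Performs the offer on a thread whose interrupt flag is already set.
    static boolean offerOnInterruptedThread() {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(10); // empty, capacity 10
        boolean[] result = new boolean[1];
        Thread worker = new Thread(() -> {
            // Stand-in for close() interrupting the callback thread.
            Thread.currentThread().interrupt();
            result[0] = offerSwallowingInterrupt(queue, "msg");
        });
        worker.start();
        try {
            worker.join(); // join makes the worker's write to result visible here
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        // The queue is empty, yet the offer reports failure.
        System.out.println("offer returned: " + offerOnInterruptedThread());
    }
}
```

On an uninterrupted thread the same helper returns true for an empty queue, which is why a caller that treats false as "queue full" ends up reporting a full queue that is in fact empty.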
I will reproduce it locally today and try to reason on it a bit. |
Related: #305 |
Was there ever a "fix" for this? I have inherited a project that I am attempting to retrofit with Nats. I am encountering the same bug/message due to the way the original code was written. There are times when the current code will "timeout" in a function and do a retry, which calls the section of code again with the .publish() call, which then throws the same error. Thanks! |
Yes, I came to the same conclusion. Connection 1 subscribes to a message and then receives that message. While processing the receive, the code tries to close the connection that it is currently on, which makes sure that there are no messages in flight. But there are, so it can't close, so it times out, throws an interrupted exception, which basically leaves the connection in a broken state. This is why closing after the subscribe works and closing before the subscribe fails. So here is a fix. Make the second subscribe in a different thread.
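The idea of moving the second subscribe to a different thread can be sketched with the JDK alone. This is an illustrative sketch under stated assumptions, not the code from this comment and not nats.java code: the class and method names are hypothetical, the manually set interrupt flag stands in for the effect of closing the connection inside the callback, and the timed queue offer stands in for the interruptible call behind subscribe.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; none of these names come from nats.java.
final class SubscribeOnOtherThreadDemo {

    // Stand-in for the interruptible call behind subscribe().
    static boolean interruptibleOffer(LinkedBlockingQueue<String> queue, String item) {
        try {
            return queue.offer(item, 1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            return false;
        }
    }

    // Simulates a callback whose own thread has been interrupted. Instead of
    // doing the follow-up work inline, it hands the work to an executor thread.
    static boolean runCallbackWithHandOff() {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(10);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicReference<Future<Boolean>> handedOff = new AtomicReference<>();
        try {
            Thread callback = new Thread(() -> {
                Thread.currentThread().interrupt(); // stand-in for close() in the callback
                Callable<Boolean> task = () -> interruptibleOffer(queue, "subscribe foo");
                // submit() does not block interruptibly, so it is safe to call here.
                handedOff.set(executor.submit(task));
                // The callback returns without waiting: Future.get() on this
                // thread would itself throw InterruptedException.
            });
            callback.start();
            callback.join();
            // Wait for the result from a thread that is not interrupted.
            return handedOff.get().get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("offer on executor thread returned: " + runCallbackWithHandOff());
    }
}
```

With the real client, the task submitted to the executor would contain the `nc2.createDispatcher(...)` and `subscribe("foo")` calls from the test code. The executor thread does not carry the callback thread's interrupt flag, so the interruptible call proceeds normally.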
|
Closing, no response. Fixed with the code example in the comments. |
I'm encountering an exception when creating a subscription on a new connection. The new connection has only had a few messages passed through it based on the message stats. I'm working on isolating a use case and will add to this issue if successful.
The goal here is to have an application receive a message, and from within the callback create a new connection and subscribe. The connection used here and the attempt to subscribe were created/called from within the callback of a different connection.
What's odd is that if the connection has never been used to publish a message, this does not occur.
The result:
The problem appears to be in adding to an empty LinkedBlockingQueue.
nats.java/src/main/java/io/nats/client/impl/MessageQueue.java
Lines 121 to 123 in 507490d
Environment
MacOS Catalina
$ java -version
java version "1.8.0_231"
Java(TM) SE Runtime Environment (build 1.8.0_231-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.231-b11, mixed mode)
Any ideas? Thanks!