You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I get some strange behaviour when I stop and restart my broker while there is an active Tortoise connection. I use Tortoise 0.9.4 on a MBP running Macos 10.15.3 (Catalina) and Elixir 1.10.0 on top of Erlang 22.2.4 (both installed via brew).
This is how the connection is set up (As a child of the app supervisor:
I run a producer and a consumer in separate processes. The producer adds messages to a queue (BlockingQueue) , and the consumer takes the messages from the queue and publishes them via publish_sync. I tried with publish and a receive..after, but I got even worse problems (some of my messages were lost).
Here is what happens:
When the broker is stopped, I immediately get a warning from the logger:
14:51:38.193 [warn] Connection has been dropped
I can keep publishing (for minutes even) , but as soon as I restart the broker, I get an error:
After 4 errors, the Tortoise client is terminated (by or via the supervisor), and a new one is created, after which everything is fine again.
I have lost no messages, but the messages that was being published at the time the broker is restarted gets published more than once.
The question is whether this is a bug in Tortoise, or am I doing something wrong? Or maybe this is what is supposed to happen and is a good example of the let it fail philosophy?
The text was updated successfully, but these errors were encountered:
Hello Martin,
Thank you for your fast response.
I am running mosquitto on my Mac. It is the latest version, installed via brew.
Kind regards,
Hein
On 31 January 2020 at 18:04:52, Martin Gausby (notifications@github.com) wrote:
Looks like the server send a package with an identifier unknown to the tortoise instance…what broker are you using?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or unsubscribe.
Hi,
I get some strange behaviour when I stop and restart my broker while there is an active Tortoise connection. I use Tortoise 0.9.4 on a MBP running Macos 10.15.3 (Catalina) and Elixir 1.10.0 on top of Erlang 22.2.4 (both installed via brew).
This is how the connection is set up (As a child of the app supervisor:
{Tortoise.Connection,
[
client_id: "hein-1",
server: {
Tortoise.Transport.Tcp,
host: "localhost", port: 1883
},
handler: {Tortoise.Handler.Logger, []}
]}
I run a producer and a consumer in separate processes. The producer adds messages to a queue (BlockingQueue) , and the consumer takes the messages from the queue and publishes them via publish_sync. I tried with publish and a receive..after, but I got even worse problems (some of my messages were lost).
Here is what happens:
When the broker is stopped, I immediately get a warning from the logger:
14:51:38.193 [warn] Connection has been dropped
I can keep publishing (for minutes even) , but as soon as I restart the broker, I get an error:
14:51:50.926 [error] GenStateMachine {Tortoise.Registry, {Tortoise.Connection.Inflight, "hein-1"}} terminating
** (exit) {:protocol_violation, :unknown_identifier}
(stdlib 3.11.1) gen_statem.erl:1298: :gen_statem.loop_state_callback_result/11
(stdlib 3.11.1) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
State: {{:connected, {Tortoise.Transport.Tcp, #Port<0.16>}}, %Tortoise.Connection.Inflight{client_id: "hein-1", order: [], pending: %{}}}
Callback mode: :handle_event_function
After the timeout on publish_sync I get this:
14:52:50.914 [info] answer = {:error, :timeout}
And I try to publish the message again:
14:52:50.914 [info] publish message Message 4
But get another error:
14:52:50.922 [error] GenStateMachine {Tortoise.Registry, {Tortoise.Connection.Inflight, "hein-1"}} terminating
** (CaseClauseError) no case clause matching: {:error, :closed}
(tortoise 0.9.4) lib/tortoise/connection/inflight.ex:292: Tortoise.Connection.Inflight.handle_event/4
(stdlib 3.11.1) gen_statem.erl:1161: :gen_statem.loop_state_callback/11
(stdlib 3.11.1) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Then I get this warning:
14:52:55.912 [warn] Client has been terminated with reason: {:function_clause, [{Tortoise.Handler.Logger, :connection, [:terminated, %Tortoise.Handler.Logger{}], [file: 'lib/tortoise/handler/logger.ex', line: 16]}, {Tortoise.Handler, :execute, 2, [file: 'lib/tortoise/handler.ex', line: 314]}, {Tortoise.Connection.Controller, :handle_info, 2, [file: 'lib/tortoise/connection/controller.ex', line: 215]}, {:gen_server, :try_dispatch, 4, [file: 'gen_server.erl', line: 637]}, {:gen_server, :handle_msg, 6, [file: 'gen_server.erl', line: 711]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]}
And this error:
14:52:55.920 [error] GenServer {Tortoise.Registry, {Tortoise.Connection.Controller, "hein-1"}} terminating
** (FunctionClauseError) no function clause matching in Tortoise.Handler.Logger.connection/2
(tortoise 0.9.4) lib/tortoise/handler/logger.ex:16: Tortoise.Handler.Logger.connection(:terminated, %Tortoise.Handler.Logger{})
(tortoise 0.9.4) lib/tortoise/handler.ex:314: Tortoise.Handler.execute/2
(tortoise 0.9.4) lib/tortoise/connection/controller.ex:215: Tortoise.Connection.Controller.handle_info/2
Then comes a ping timeout:
14:52:55.922 [error] GenServer {Tortoise.Registry, {Tortoise.Connection, "hein-1"}} terminating
** (stop) :ping_timeout
Last message: :ping
State: %Tortoise.Connection{backoff: %Tortoise.Connection.Backoff{max_interval: 30000, min_interval: 100, value: nil}, client_id: "hein-1", connect: %Tortoise.Package.Connect{META: %Tortoise.Package.Meta{flags: 0, opcode: 1}, clean_session: false, client
After 4 errors, the Tortoise client is terminated (by or via the supervisor), and a new one is created, after which everything is fine again.
I have lost no messages, but the messages that was being published at the time the broker is restarted gets published more than once.
The question is whether this is a bug in Tortoise, or am I doing something wrong? Or maybe this is what is supposed to happen and is a good example of the let it fail philosophy?
The text was updated successfully, but these errors were encountered: