Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a delay before reconnecting to MQTT #161

Merged
merged 2 commits into from
Sep 20, 2023
Merged

Conversation

subpop
Copy link
Collaborator

@subpop subpop commented Sep 15, 2023

Add a new hidden flag: --mqtt-reconnect-delay. This flag will sleep for the given duration before attempting to reconnect to the MQTT broker. This only works if --mqtt-auto-reconnect is enabled (it is true by default).

I also noticed that the previously added MQTT flags were not set into the DefaultConfig struct, so their values when read from the struct were always the zero value. I fixed this in a separate commit on this same PR.

Copy link
Contributor

@jirihnidek jirihnidek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can confirm that it works, but it is not clear why this new option was added and why --mqtt-connect-retry-interval is not enough. I have some comments and proposals how to improve the code.

@@ -508,6 +509,12 @@ func main() {
Value: true,
Hidden: true,
}),
altsrc.NewDurationFlag(&cli.DurationFlag{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is pretty confusing that you cannot use --mqtt-reconnect-delay with simple numeric value. I was expecting that it should work and value would be considered in seconds.

I think that we should emphasize in Usage: that argument should be something like 10s (not only 10 or "10").

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should also improve Usage text of other occurrences of DurationFlag.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internally, the DurationFlag uses time.ParseDuration to parse the value. That function does not make assumptions about default duration unit if the unit is omitted. The error returned by an invalid value for the flag does not make a suggestion about how to specify a valid format. That error comes from the cli package though; we can't change it.

These are hidden flags though; their Usage won't be visible to a user in the --help output. I changed the hidden value to true to see what the usage text looks like:

--mqtt-reconnect-delay DURATION  Sets the time to wait before attempting to reconnect to DURATION (default: 0s)

The default value there hints at using a duration unit, so if we do ever make these visible flags, the usage does suggest to the user how to specify a correct value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you are right, when the flag is hidden, then the usage could not visible.

@@ -99,6 +99,18 @@ func NewMQTTTransport(clientID string, brokers []string, tlsConfig *tls.Config)
t.events <- TransporterEventDisconnected
})

opts.SetReconnectingHandler(func(c mqtt.Client, co *mqtt.ClientOptions) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When --mqtt-connect-retry is used, then this handler function is also used. I hope that we could improve messaging in this case. When yggd is not able to reconnect, then some exponential backoff algorithm is used for computing next attempt. It would be nice to see something like, when reconnecting failed: reconnecting failed, next reconnect attempt in X seconds. I don't see where to print such message. I also do not know if it is possible to get information about value of delay computed by exponential backoff algorithm.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, the MQTT client package does the backing off itself. We just get a handler that's called before the reconnect attempt happens. I don't think the value is exported by the package at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed that the delay is printed, when MQTT_DEBUG is set.

internal/transport/mqtt.go Outdated Show resolved Hide resolved
internal/transport/mqtt.go Outdated Show resolved Hide resolved
@subpop
Copy link
Collaborator Author

subpop commented Sep 19, 2023

I can confirm that it works, but it is not clear why this new option was added and why --mqtt-connect-retry-interval is not enough. I have some comments and proposals how to improve the code.

ConnectRetryInterval is used internally by the client to delay reconnecting on initial connection only. We could assume the same value and use it in our time.Sleep inside the ReconnectingHandler, but I feel like that might be making things more confusing.

SetConnectRetryInterval sets the time that will be waited between connection attempts when initially connecting if ConnectRetry is TRUE

https://pkg.go.dev/github.com/eclipse/paho.mqtt.golang#ClientOptions.SetConnectRetryInterval

@subpop subpop force-pushed the sleep-before-reconnect branch from 4f337f4 to 888f319 Compare September 19, 2023 15:51
Make sure the values of the mqtt-* flags are pulled from the CLI context
into the DefaultConfig.

Signed-off-by: Link Dupont <link@sub-pop.net>
Add a new hidden flag: --mqtt-reconnect-delay. This flag will sleep for
the given duration before attempting to reconnect to the MQTT broker.
This only works if --mqtt-auto-reconnect is enabled (it is true by
default).

Signed-off-by: Link Dupont <link@sub-pop.net>
@subpop subpop force-pushed the sleep-before-reconnect branch from 888f319 to d15eb02 Compare September 19, 2023 15:58
Copy link
Contributor

@jirihnidek jirihnidek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for updates. 👍

@@ -508,6 +509,12 @@ func main() {
Value: true,
Hidden: true,
}),
altsrc.NewDurationFlag(&cli.DurationFlag{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you are right, when the flag is hidden, then the usage could not visible.

@@ -99,6 +99,18 @@ func NewMQTTTransport(clientID string, brokers []string, tlsConfig *tls.Config)
t.events <- TransporterEventDisconnected
})

opts.SetReconnectingHandler(func(c mqtt.Client, co *mqtt.ClientOptions) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed that the delay is printed, when MQTT_DEBUG is set.

@jirihnidek jirihnidek merged commit 1f015e8 into main Sep 20, 2023
12 of 13 checks passed
@jirihnidek jirihnidek deleted the sleep-before-reconnect branch September 20, 2023 08:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants