Add temporarily invalid connection state to Known Issues #668

Closed
rbino opened this issue Mar 17, 2022 · 1 comment · Fixed by astarte-platform/astarte_vmq_plugin#64
Labels: user experience

Comments

rbino commented Mar 17, 2022

VerneMQ does not guarantee the order of hooks due to the distributed nature of the system (see vernemq/vernemq#1741 and vernemq/vernemq#1725).

This means that in some very rare corner cases (i.e. when the disconnection and reconnection of a device are less than ~10 ms apart, which usually happens only when two clients reuse the same client id) the connection event is emitted before the disconnection event, leading to an inconsistent state (i.e. the device is actually connected and publishing but appears as `connected: false` in the APIs).

The state eventually converges to the correct one as soon as the first heartbeat from the device is received (by default, after 1 hour), but we should document this in the Known Issues section of the documentation to help users who run into this.
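
To make the failure mode concrete, here is a minimal sketch of a "last event wins" status handler. It is not Astarte's code: `DeviceStatus`, `apply_event` and the event names are hypothetical, chosen only to show how a reordered connection/disconnection pair leaves the device as `connected: false` until a heartbeat arrives.

```python
from dataclasses import dataclass


@dataclass
class DeviceStatus:
    # Hypothetical stand-in for the connection flag exposed by the APIs
    connected: bool = False


def apply_event(status: DeviceStatus, event: str) -> DeviceStatus:
    """Naive handler: the last event received wins, whatever its real order."""
    if event in ("connection", "heartbeat"):
        status.connected = True
    elif event == "disconnection":
        status.connected = False
    return status


# The device really disconnected and then reconnected, but the two
# events are delivered in the opposite order.
status = DeviceStatus(connected=True)
for event in ["connection", "disconnection"]:
    apply_event(status, event)
print(status.connected)  # False, although the device is online

# The first heartbeat brings the state back to the correct value.
apply_event(status, "heartbeat")
print(status.connected)  # True
```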

rbino commented Apr 1, 2022

Additional notes: when the status converges using the heartbeat, we currently don't fire connection triggers, so external applications that are fed by triggers do not receive the updated state.
Note that the device can actually be publishing while it is marked as disconnected, so maybe we should also check the connection in handle_data and similar functions and possibly update it (firing the relevant triggers).
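
The idea of checking the connection when data arrives could look roughly like the sketch below. Everything here is an assumption made for illustration: the `Device` type, the `fire_trigger` callback and the `"device_connected"` trigger name are hypothetical, and this `handle_data` is not the actual Astarte function.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Device:
    # Hypothetical in-memory view of a device's state
    device_id: str
    connected: bool = False


def handle_data(
    device: Device,
    payload: bytes,
    fire_trigger: Callable[[str, str], None],
) -> None:
    """Process incoming data, repairing a stale 'disconnected' flag first."""
    if not device.connected:
        # A device that publishes must be online: fix the state and
        # notify trigger consumers, which would otherwise never learn of it.
        device.connected = True
        fire_trigger("device_connected", device.device_id)
    # ... normal payload processing would go here ...
```

Called twice for a device marked as disconnected, this fires the trigger only on the first message, since the flag is already correct on the second.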

rbino added the documentation and user experience labels and removed the documentation label on Apr 1, 2022
Annopaolo added a commit to Annopaolo/astarte_vmq_plugin that referenced this issue Sep 26, 2022

In rare corner cases (time difference < 10 ms), the `on_register` hook
(reconnection) may be called before the related `on_client_offline/gone`
hook (disconnection). This is due to the distributed nature of VerneMQ
(see vernemq/vernemq#1741), and results in an
invalid device connection state in Astarte: a device may be publishing
while being in a disconnected status.
Introduce a check on the order of disconnection/reconnection events, and
possibly reorder them if such a corner case happens.
Fix astarte-platform/astarte#668.

Signed-off-by: Arnaldo Cesco <arnaldo.cesco@secomind.com>
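
As a rough illustration of what an order check on disconnection/reconnection events might look like, the sketch below sorts a small batch of hook events before applying them. It is not the code from the linked commit. It assumes each event carries a timestamp of when it actually happened (`timestamp_ms`, a hypothetical field), and `HookEvent`, `reorder` and `final_connected` are made-up names.

```python
from typing import List, NamedTuple


class HookEvent(NamedTuple):
    client_id: str
    kind: str          # "on_register" or "on_client_offline"
    timestamp_ms: int  # assumed: when the event happened, not when the hook ran


def reorder(events: List[HookEvent]) -> List[HookEvent]:
    """Sort by timestamp; on a tie, put the disconnection first so that
    the reconnection is applied last."""
    return sorted(
        events,
        key=lambda e: (e.timestamp_ms, e.kind != "on_client_offline"),
    )


def final_connected(events: List[HookEvent]) -> bool:
    """Apply events in sequence and return the resulting connection flag."""
    connected = False
    for event in events:
        connected = event.kind == "on_register"
    return connected


# Hooks delivered out of order: reconnection first, then the older disconnection.
received = [
    HookEvent("dev-1", "on_register", 1005),
    HookEvent("dev-1", "on_client_offline", 1000),
]
print(final_connected(received))           # False: the invalid state
print(final_connected(reorder(received)))  # True: the device is online
```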
Annopaolo added a commit to Annopaolo/astarte_vmq_plugin that referenced this issue Sep 27, 2022
Annopaolo added a commit to Annopaolo/astarte_vmq_plugin that referenced this issue Oct 3, 2022
Annopaolo added a commit to Annopaolo/astarte_vmq_plugin that referenced this issue Oct 4, 2022
Annopaolo added a commit to Annopaolo/astarte_vmq_plugin that referenced this issue Nov 11, 2022