Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Message in MQTT queue is not delivered on yggd start #82

Open
ezr-ondrej opened this issue Apr 19, 2022 · 5 comments
Open

Message in MQTT queue is not delivered on yggd start #82

ezr-ondrej opened this issue Apr 19, 2022 · 5 comments

Comments

@ezr-ondrej
Copy link
Member

Describe the bug
When yggd is starting up, it receives messages waiting in queue prior the workers get registered, that results in the message being dropped as yggd doesn't have the worker to deliver it to.

To Reproduce
Steps to reproduce the behavior:

  1. Stop yggd
  2. Deliver message for this client to mqtt broker for specific worker
  3. Start yggd
  4. See whether message got delivered to the worker

Expected behavior
Message from broker queue being delivered to the worker on yggd start-up

@subpop
Copy link
Collaborator

subpop commented Sep 26, 2022

Are you observing this in yggdrasil-0.2, main, or both?

@ezr-ondrej
Copy link
Member Author

ezr-ondrej commented Sep 29, 2022

I believe it was ygg 0.2, but maybe @adamruzicka knows if this is still an issue on main?
EDIT Adam confirmed it is 0.2, we did not update to main yet, should this issue be fixed on main?

@adamruzicka
Copy link

I just checked with main, it seems to be fixed there.

@adamruzicka
Copy link

Oh, wait. It is fixed only sometimes. The message was set to be retained and sent with qos 1.

# mosquitto
Oct 10 10:55:00 april-nightingale.example.com mosquitto[3367]: 1665399300: New connection from 192.168.122.247:43290 on port 1883.
Oct 10 10:55:00 april-nightingale.example.com mosquitto[3367]: 1665399300: New client connected from 192.168.122.247:43290 as auto-1D55AE4A-9F1B-1FB7-F774-95D2B4AC79FD (p2, c1, k15, u'april-nightingale.example.com').                                                               
Oct 10 10:55:00 april-nightingale.example.com mosquitto[3367]: 1665399300: No will message specified.
Oct 10 10:55:00 april-nightingale.example.com mosquitto[3367]: 1665399300: Sending CONNACK to auto-1D55AE4A-9F1B-1FB7-F774-95D2B4AC79FD (0, 0)                                                                                                                                         
Oct 10 10:55:00 april-nightingale.example.com mosquitto[3367]: 1665399300: Received PUBLISH from auto-1D55AE4A-9F1B-1FB7-F774-95D2B4AC79FD (d0, q1, r1, m1, 'yggdrasil/april-nightingale.example.com/data/in', ... (523 bytes))                                                        
Oct 10 10:55:00 april-nightingale.example.com mosquitto[3367]: 1665399300: Sending PUBACK to auto-1D55AE4A-9F1B-1FB7-F774-95D2B4AC79FD (m1, rc0)                                                                                                                                       
Oct 10 10:55:00 april-nightingale.example.com mosquitto[3367]: 1665399300: Received DISCONNECT from auto-1D55AE4A-9F1B-1FB7-F774-95D2B4AC79FD                                                                                                                                          
Oct 10 10:55:00 april-nightingale.example.com mosquitto[3367]: 1665399300: Client auto-1D55AE4A-9F1B-1FB7-F774-95D2B4AC79FD disconnected.

===== started yggdasild here

Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: New connection from 192.168.122.247:43292 on port 1883.
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: New client connected from 192.168.122.247:43292 as april-nightingale.example.com (p2, c1, k30, u'april-nightingale.example.com').                                                                           
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: Will message specified (346 bytes) (r0, q1).
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307:         yggdrasil/april-nightingale.example.com/control/out
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: Sending CONNACK to april-nightingale.example.com (0, 0)
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: Received SUBSCRIBE from april-nightingale.example.com
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307:         yggdrasil/april-nightingale.example.com/data/in (QoS 1)
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: april-nightingale.example.com 1 yggdrasil/april-nightingale.example.com/data/in                                                                                                                             
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: Sending SUBACK to april-nightingale.example.com
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: Sending PUBLISH to april-nightingale.example.com (d0, q1, r1, m1, 'yggdrasil/april-nightingale.example.com/data/in', ... (523 bytes))                                                                       
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: Denied PUBLISH from april-nightingale.example.com (d0, q0, r0, m0, 'yggdrasil/april-nightingale.example.com/data/out', ... (0 bytes))                                                                       
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: Received SUBSCRIBE from april-nightingale.example.com
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307:         yggdrasil/april-nightingale.example.com/control/in (QoS 1)                                                                                                                                          
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: april-nightingale.example.com 1 yggdrasil/april-nightingale.example.com/control/in                                                                                                                          
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: Sending SUBACK to april-nightingale.example.com
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: Received PUBLISH from april-nightingale.example.com (d0, q1, r0, m3, 'yggdrasil/april-nightingale.example.com/control/out', ... (524 bytes))                                                                
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: Sending PUBACK to april-nightingale.example.com (m3, rc0)
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: Received PUBACK from april-nightingale.example.com (Mid: 1, RC:0)                                                                                                                                           
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: Received PUBLISH from april-nightingale.example.com (d0, q1, r0, m4, 'yggdrasil/april-nightingale.example.com/control/out', ... (539 bytes))                                                                
Oct 10 10:55:07 april-nightingale.example.com mosquitto[3367]: 1665399307: Sending PUBACK to april-nightingale.example.com (m4, rc0)
# /usr/sbin/yggdrasild
[yggd] 2022/10/10 10:55:07 /root/yggdrasil/cmd/yggd/main.go:207: starting yggd version 0.2.98
[yggd] 2022/10/10 10:55:07 /root/yggdrasil/cmd/yggd/main.go:251: listening on socket: @yggd-dispatcher-Ninmtj
[yggd] 2022/10/10 10:55:07 /root/yggdrasil/cmd/yggd/main.go:382: starting worker: foreman
[yggd] 2022/10/10 10:55:07 /root/yggdrasil/cmd/yggd/main.go:408: cannot start watching '/etc/yggdrasil/tags.toml': lstat /etc/yggdrasil/tags.toml: no such file or directory                                                                                                           
[yggd] 2022/10/10 10:55:07 /root/yggdrasil/cmd/yggd/grpc.go:205: cannot route message  cea54f7b-8d10-4064-ae44-222a5edf7e12 to directive: foreman                                                                                                                                      
[yggd] 2022/10/10 10:55:07 /root/yggdrasil/cmd/yggd/grpc.go:84: worker registered: {pid:3383 handler:foreman addr:@ygg-foreman-KAcKCW features:map[] detachedContent:true}                                                                                                             
[yggd] 2022/10/10 10:55:07 /root/yggdrasil/internal/transport/mqtt.go:141: published message to topic yggdrasil/april-nightingale.example.com/control/out                                                                                                                              
[yggd] 2022/10/10 10:55:07 /root/yggdrasil/internal/transport/mqtt.go:141: published message to topic yggdrasil/april-nightingale.example.com/control/out
^C

# /usr/sbin/yggdrasild
[yggd] 2022/10/10 10:56:47 /root/yggdrasil/cmd/yggd/main.go:207: starting yggd version 0.2.98
[yggd] 2022/10/10 10:56:47 /root/yggdrasil/cmd/yggd/main.go:251: listening on socket: @yggd-dispatcher-RxagTL
[yggd] 2022/10/10 10:56:52 /root/yggdrasil/cmd/yggd/main.go:382: starting worker: foreman
[yggd] 2022/10/10 10:56:52 /root/yggdrasil/cmd/yggd/main.go:408: cannot start watching '/etc/yggdrasil/tags.toml': lstat /etc/yggdrasil/tags.toml: no such file or directory                                                                                                           
[yggd] 2022/10/10 10:56:52 /root/yggdrasil/cmd/yggd/grpc.go:205: cannot route message  cea54f7b-8d10-4064-ae44-222a5edf7e12 to directive: foreman                                                                                                                                      
[yggd] 2022/10/10 10:56:52 /root/yggdrasil/internal/transport/mqtt.go:141: published message to topic yggdrasil/april-nightingale.example.com/control/out                                                                                                                              
[yggd] 2022/10/10 10:56:52 /root/yggdrasil/cmd/yggd/grpc.go:84: worker registered: {pid:3405 handler:foreman addr:@ygg-foreman-MQZNWK features:map[] detachedContent:true}                                                                                                             
[yggd] 2022/10/10 10:56:52 /root/yggdrasil/internal/transport/mqtt.go:141: published message to topic yggdrasil/april-nightingale.example.com/control/out                                                                                                                              
^C

# /usr/sbin/yggdrasild
[yggd] 2022/10/10 10:56:59 /root/yggdrasil/cmd/yggd/main.go:207: starting yggd version 0.2.98
[yggd] 2022/10/10 10:56:59 /root/yggdrasil/cmd/yggd/main.go:251: listening on socket: @yggd-dispatcher-trqWtL
[yggd] 2022/10/10 10:56:59 /root/yggdrasil/cmd/yggd/main.go:382: starting worker: foreman
[yggd] 2022/10/10 10:56:59 /root/yggdrasil/cmd/yggd/main.go:408: cannot start watching '/etc/yggdrasil/tags.toml': lstat /etc/yggdrasil/tags.toml: no such file or directory                                                                                                           
[yggd] 2022/10/10 10:56:59 /root/yggdrasil/internal/transport/mqtt.go:141: published message to topic yggdrasil/april-nightingale.example.com/control/out                                                                                                                              
[yggd] 2022/10/10 10:56:59 /root/yggdrasil/cmd/yggd/grpc.go:205: cannot route message  cea54f7b-8d10-4064-ae44-222a5edf7e12 to directive: foreman                                                                                                                                      
[yggd] 2022/10/10 10:56:59 /root/yggdrasil/cmd/yggd/grpc.go:84: worker registered: {pid:3418 handler:foreman addr:@ygg-foreman-tqyQoc features:map[] detachedContent:true}                                                                                                             
[yggd] 2022/10/10 10:56:59 /root/yggdrasil/internal/transport/mqtt.go:141: published message to topic yggdrasil/april-nightingale.example.com/control/out                                                                                                                              
^C

# /usr/sbin/yggdrasild
[yggd] 2022/10/10 10:57:03 /root/yggdrasil/cmd/yggd/main.go:207: starting yggd version 0.2.98
[yggd] 2022/10/10 10:57:03 /root/yggdrasil/cmd/yggd/main.go:251: listening on socket: @yggd-dispatcher-lbCVCd
[yggd] 2022/10/10 10:57:04 /root/yggdrasil/cmd/yggd/main.go:382: starting worker: foreman
[yggd] 2022/10/10 10:57:04 /root/yggdrasil/cmd/yggd/main.go:408: cannot start watching '/etc/yggdrasil/tags.toml': lstat /etc/yggdrasil/tags.toml: no such file or directory                                                                                                           
[yggd] 2022/10/10 10:57:04 /root/yggdrasil/cmd/yggd/grpc.go:84: worker registered: {pid:3432 handler:foreman addr:@ygg-foreman-RzUfvY features:map[] detachedContent:true}                                                                                                             
[yggd] 2022/10/10 10:57:04 /root/yggdrasil/internal/transport/mqtt.go:141: published message to topic yggdrasil/april-nightingale.example.com/control/out                                                                                                                              
[yggd] 2022/10/10 10:57:04 /root/yggdrasil/internal/http/client.go:41: sending HTTP request: GET https://april-nightingale.example.com:8443/ssh/jobs/2533a549-5320-4f4a-bfb7-0fda93c266bf                                                                                              
[yggd] 2022/10/10 10:57:04 /root/yggdrasil/internal/transport/mqtt.go:141: published message to topic yggdrasil/april-nightingale.example.com/control/out                                                                                                                              
[yggd] 2022/10/10 10:57:04 /root/yggdrasil/cmd/yggd/grpc.go:262: dispatched message cea54f7b-8d10-4064-ae44-222a5edf7e12 to worker foreman
[yggd] 2022/10/10 10:57:04 /root/yggdrasil/cmd/yggd/worker.go:114: /usr/libexec/yggdrasil/foreman_ygg_worker: 2022/10/10 10:57:04 Starting job 2533a549-5320-4f4a-bfb7-0fda93c266bf                                                                                                    
[yggd] 2022/10/10 10:57:04 /root/yggdrasil/cmd/yggd/worker.go:114: /usr/libexec/yggdrasil/foreman_ygg_worker: 2022/10/10 10:57:04 script of 8 bytes written in : "/tmp/ygg_rex544351191"                                                                                               
[yggd] 2022/10/10 10:57:04 /root/yggdrasil/cmd/yggd/worker.go:114: /usr/libexec/yggdrasil/foreman_ygg_worker: 2022/10/10 10:57:04 started script process: 3441                                                                                                                         
[yggd] 2022/10/10 10:57:04 /root/yggdrasil/cmd/yggd/worker.go:114: /usr/libexec/yggdrasil/foreman_ygg_worker: 2022/10/10 10:57:04 Finished job 2533a549-5320-4f4a-bfb7-0fda93c266bf                                                                                                    
[yggd] 2022/10/10 10:57:04 /root/yggdrasil/internal/http/client.go:58: sending HTTP request: POST https://april-nightingale.example.com:8443/ssh/jobs/2533a549-5320-4f4a-bfb7-0fda93c266bf/update                                                                                      
[yggd] 2022/10/10 10:57:04 /root/yggdrasil/cmd/yggd/grpc.go:156: received message 7415427c-1302-4a17-8c35-a53cd01a9478
^C

# /usr/sbin/yggdrasild
[yggd] 2022/10/10 10:58:07 /root/yggdrasil/cmd/yggd/main.go:207: starting yggd version 0.2.98
[yggd] 2022/10/10 10:58:07 /root/yggdrasil/cmd/yggd/main.go:251: listening on socket: @yggd-dispatcher-fgCJsF
[yggd] 2022/10/10 10:58:07 /root/yggdrasil/cmd/yggd/main.go:382: starting worker: foreman
[yggd] 2022/10/10 10:58:07 /root/yggdrasil/cmd/yggd/main.go:408: cannot start watching '/etc/yggdrasil/tags.toml': lstat /etc/yggdrasil/tags.toml: no such file or directory                                                                                                           
[yggd] 2022/10/10 10:58:07 /root/yggdrasil/cmd/yggd/grpc.go:205: cannot route message  cea54f7b-8d10-4064-ae44-222a5edf7e12 to directive: foreman                                                                                                                                      
[yggd] 2022/10/10 10:58:07 /root/yggdrasil/cmd/yggd/grpc.go:84: worker registered: {pid:3454 handler:foreman addr:@ygg-foreman-FYJwOe features:map[] detachedContent:true}                                                                                                             
[yggd] 2022/10/10 10:58:07 /root/yggdrasil/internal/transport/mqtt.go:141: published message to topic yggdrasil/april-nightingale.example.com/control/out                                                                                                                              
[yggd] 2022/10/10 10:58:07 /root/yggdrasil/internal/transport/mqtt.go:141: published message to topic yggdrasil/april-nightingale.example.com/control/out

@subpop
Copy link
Collaborator

subpop commented Feb 10, 2023

This might have been fixed inadvertently, especially now on main after 0df7719. This changes the dispatcher IPC to one based on D-Bus, which changes the way workers start up, the way messages are held or dropped at the IPC boundary between worker and dispatcher, and the way the transport and dispatcher types within yggd interact and start.

I assume it will take a little work to reproduce this again now that workers need to be rewritten to interact with yggd, but it should be possible to try and replicate with the echo worker.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants