DPI-report daemon occasionally crashes without relevant log information #809

Closed
gsanchietti opened this issue Oct 1, 2024 · 3 comments
Labels
verified All test cases were verified successfully

gsanchietti commented Oct 1, 2024

Description:

The dpireport daemon sometimes terminates unexpectedly without providing any useful information in the logs. When the daemon dies, it becomes impossible to retrieve network traffic data used inside Realtime Monitoring and Historical Monitoring.

Log extract:

Sep 30 13:54:25 fw dpireport[4801]: INFO: Started dpireport
Sep 30 13:54:25 fw dpireport[4801]: INFO: Starting threads ...

Impact:

  • No network traffic data can be retrieved in Realtime Monitoring and Historical Monitoring when the daemon is down.

Proposed solution:

  • Avoid using procd to manage the dpi-report daemon.
  • Implement a more robust supervisor daemon capable of handling restarts and better monitoring the health of the service.

Components:

NethSecurity Image: 8-23.05.4-ns.1.2.0

@gsanchietti gsanchietti added the bug label Oct 1, 2024
@gsanchietti gsanchietti added this to the NethSecurity 8.3 milestone Oct 1, 2024
@gsanchietti gsanchietti moved this to In progress 🛠 in NethSecurity Oct 1, 2024
gsanchietti added a commit that referenced this issue Oct 16, 2024
Changes:

- restart threads if netifyd socket disappears
- wait for netifyd socket indefinitely
- log socket connection and disconnection to ease debug

#809

Testing image version: 8-23.05.5-ns.1.2.99-alpha1-83-g34a616676f

@github-actions github-actions bot added the testing Packages are available from testing repositories label Oct 16, 2024
@gsanchietti
Member Author

The daemon was failing whenever netifyd restarted.
The implemented change waits for the netifyd socket to reappear and then restarts the threads.
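
The daemon's code is not included in this thread, so the following is only a minimal sketch of the behaviour described above, assuming a UNIX stream socket at the path reported in the logs. The names `wait_for_socket` and `connect`, and the injectable `exists`/`sleep` parameters, are hypothetical. The doubling delay is inferred from the log messages, not taken from the actual implementation.

```python
import logging
import os
import socket
import time

# Path as it appears in the log messages of this issue.
SOCKET_PATH = "/var/run/netifyd/netifyd.sock"


def wait_for_socket(path, exists=os.path.exists, sleep=time.sleep):
    """Block until `path` exists, doubling the wait after every miss.

    Returns the list of delays that were slept (useful for testing).
    """
    delay = 1
    delays = []
    while not exists(path):
        logging.info("Netifyd socket %s not found. Waiting %d seconds", path, delay)
        sleep(delay)
        delays.append(delay)
        # Assumption: no upper bound on the delay; the logs only show values up to 32.
        delay *= 2
    return delays


def connect(path=SOCKET_PATH):
    """Wait for the socket file, then open a UNIX stream connection to it."""
    wait_for_socket(path)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    logging.info("Connected to socket")
    return sock
```

Because the loop never gives up, the process stays alive while netifyd is down, and the worker threads can be started again once `connect` returns.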

Test case

  • Make sure dpireport is running: ps aux | grep dpi
  • Stop netifyd: /etc/init.d/netifyd stop
  • Verify dpireport is still running after 10 seconds
  • Look inside the log for dpireport messages: grep dpi /var/log/messages
  • Start netifyd: /etc/init.d/netifyd start
  • Verify dpireport has reconnected to the socket: grep dpi /var/log/messages
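
The last log check can also be scripted. This is a hedged sketch, assuming the message formats reported later in this thread ("Connection to netifyd socket ... closed" and "Connected to socket"). The function name `reconnected_after_disconnect` is hypothetical and not part of dpireport.

```python
import re

# Patterns modelled on the dpireport log lines quoted in this issue.
DISCONNECTED = re.compile(r"dpireport\[\d+\]: ERROR: Connection to netifyd socket .* closed")
CONNECTED = re.compile(r"dpireport\[\d+\]: INFO: Connected to socket")


def reconnected_after_disconnect(lines):
    """Return True if a 'Connected to socket' line follows a disconnection."""
    seen_disconnect = False
    for line in lines:
        if DISCONNECTED.search(line):
            seen_disconnect = True
        elif seen_disconnect and CONNECTED.search(line):
            return True
    return False


if __name__ == "__main__":
    # Assumes the log file used in the test case above.
    with open("/var/log/messages", errors="replace") as log:
        print(reconnected_after_disconnect(log))
```

The script prints `True` only when a connection message appears after a disconnection, which is the condition the final step checks by eye.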

cotosso commented Oct 16, 2024

With the new image everything works flawlessly:

root@NSec8-VM-davidem:~# ps ax | grep dpi
 4241 ?        Sl     0:00 /usr/bin/python /usr/bin/dpireport
/etc/init.d/netifyd stop

After stopping netifyd, dpireport continues running:

root@NSec8-VM-davidem:~# ps ax | grep dpi
 4241 ?        Sl     0:00 /usr/bin/python /usr/bin/dpireport
 7393 pts/1    S+     0:00 grep dpi
root@NSec8-VM-davidem:~# date
Wed Oct 16 09:59:20 UTC 2024
root@NSec8-VM-davidem:~# date
Wed Oct 16 09:59:37 UTC 2024
root@NSec8-VM-davidem:~# ps ax | grep dpi
 4241 ?        Sl     0:00 /usr/bin/python /usr/bin/dpireport
 7417 pts/1    R+     0:00 grep dpi

/var/log/messages:

Oct 16 09:58:39 NSec8-VM-davidem dpireport[4241]: INFO: Connected to socket
Oct 16 09:59:18 NSec8-VM-davidem dpireport[4241]: ERROR: Connection to netifyd socket /var/run/netifyd/netifyd.sock closed
Oct 16 09:59:18 NSec8-VM-davidem dpireport[4241]: INFO: Closing threads ...
Oct 16 09:59:19 NSec8-VM-davidem dpireport[4241]: INFO: Starting threads ...
Oct 16 09:59:19 NSec8-VM-davidem dpireport[4241]: INFO: Netifyd socket /var/run/netifyd/netifyd.sock not found. Waiting 1 seconds
Oct 16 09:59:20 NSec8-VM-davidem dpireport[4241]: INFO: Netifyd socket /var/run/netifyd/netifyd.sock not found. Waiting 2 seconds
Oct 16 09:59:22 NSec8-VM-davidem dpireport[4241]: INFO: Netifyd socket /var/run/netifyd/netifyd.sock not found. Waiting 4 seconds
Oct 16 09:59:26 NSec8-VM-davidem dpireport[4241]: INFO: Netifyd socket /var/run/netifyd/netifyd.sock not found. Waiting 8 seconds
Oct 16 09:59:34 NSec8-VM-davidem dpireport[4241]: INFO: Netifyd socket /var/run/netifyd/netifyd.sock not found. Waiting 16 seconds
Oct 16 09:59:50 NSec8-VM-davidem dpireport[4241]: INFO: Netifyd socket /var/run/netifyd/netifyd.sock not found. Waiting 32 seconds

Then I restarted netifyd; dpireport is still there:

root@NSec8-VM-davidem:~# date
Wed Oct 16 09:59:57 UTC 2024
root@NSec8-VM-davidem:~# ps ax | grep dpi
 4241 ?        Sl     0:00 /usr/bin/python /usr/bin/dpireport

The logs show a new connection to the socket:

Oct 16 09:59:50 NSec8-VM-davidem dpireport[4241]: INFO: Netifyd socket /var/run/netifyd/netifyd.sock not found. Waiting 32 seconds
Oct 16 10:00:22 NSec8-VM-davidem dpireport[4241]: INFO: Connected to socket
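
The timestamps above are consistent with a wait that doubles on every failed check. A small sketch to verify the arithmetic, assuming the first wait is logged at 09:59:19 and each following check happens right after the previous wait ends (the helper name `backoff_timeline` is hypothetical):

```python
from datetime import datetime, timedelta


def backoff_timeline(start, first_delay=1, steps=6):
    """Return (log_time, delay) pairs and the time at which the last wait ends."""
    current = start
    delay = first_delay
    timeline = []
    for _ in range(steps):
        timeline.append((current, delay))
        current += timedelta(seconds=delay)
        delay *= 2  # each wait is twice as long as the previous one
    return timeline, current


start = datetime(2024, 10, 16, 9, 59, 19)
timeline, end = backoff_timeline(start)
for when, delay in timeline:
    print(when.strftime("%H:%M:%S"), f"waiting {delay} seconds")
print(end.strftime("%H:%M:%S"), "next socket check")
```

The six waits (1 + 2 + 4 + 8 + 16 + 32 = 63 seconds) end at 10:00:22, which matches the "Connected to socket" line.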

@cotosso cotosso added verified All test cases were verified successfully and removed testing Packages are available from testing repositories labels Oct 16, 2024