[nvidia][hsflowd] Fix Dropmon co-operation issues related to HW stop #73

Closed
wants to merge 5 commits into from

Conversation

vivekrnv
Owner

@vivekrnv vivekrnv commented Aug 19, 2023

Why I did it

Stopping the sflow service causes other dropmon clients to stop receiving drops from that point on.

Repro Steps:

config feature state sflow disabled 
config sflow enable
config sflow collector add temp1 192.168.1.10 

<Start other dropmon clients>
<Clients receive the drops>

config sflow disable <hsflowd process exits>
<Other clients stop receiving drops>
Work item tracking
  • Microsoft ADO (number only):

How I did it

  • On process exit, hsflowd no longer stops HW drops in the drop monitor (NET_DM). On the nvidia platform, HW drops are controlled by another daemon.
  • SW drops are started in NET_DM only when sw=on is set in hsflowd.conf.
  • CONFIG failures are no longer recorded in feedcontrolerrors: if feedcontrolerrors > 0, hsflowd does not stop SW drops even when it exits. CONFIG is likely to fail with -EBUSY when NET_DM has already been configured by another daemon.
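
The three rules above can be modelled in a few lines. This is only an illustrative sketch in Python, not the hsflowd C code: the class `DropmonModel`, its methods, and the action names `start_sw` and `stop_sw` are hypothetical, and the stop condition is an assumption inferred from the description (SW drops are stopped only if they were started and no feed control errors were recorded).

```python
import errno
from dataclasses import dataclass
from typing import List


@dataclass
class DropmonModel:
    """Hypothetical model of the dropmon start/stop decisions described above."""
    sw: bool                      # mirrors sw=on / sw=off in hsflowd.conf
    feed_control_errors: int = 0  # stands in for feedcontrolerrors
    sw_started: bool = False

    def on_config_result(self, rc: int) -> None:
        # A failed CONFIG (for example -EBUSY because another daemon already
        # configured NET_DM) is tolerated and deliberately NOT counted.
        return None

    def start(self) -> List[str]:
        actions = []
        if self.sw:
            # SW drops are started only when sw=on is configured.
            self.sw_started = True
            actions.append("start_sw")
        # HW drops are never started here; another daemon owns them.
        return actions

    def stop(self) -> List[str]:
        actions = []
        # Assumption: only undo what this process started, and only when
        # no feed control errors were recorded.
        if self.sw_started and self.feed_control_errors == 0:
            actions.append("stop_sw")
        # HW drops are never stopped on exit.
        return actions


if __name__ == "__main__":
    for sw in (False, True):
        model = DropmonModel(sw=sw)
        model.on_config_result(-errno.EBUSY)
        print(sw, model.start(), model.stop())
```

In this model the sw=off case issues no start or stop request at all, so a daemon that already has NET_DM in monitoring state is left undisturbed when hsflowd exits.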

How to verify it

  1. Run the repro steps above and confirm that the other client keeps receiving drops (default case, sw=off).
Aug 19 00:32:44.212065 r-leopard-41 NOTICE sflow#sflowmgrd: :- sflowHandleService: Starting hsflowd service
Aug 19 00:32:44.212237 r-leopard-41 INFO sflow#hsflowd: started
Aug 19 00:32:44.212282 r-leopard-41 INFO sflow#hsflowd: autoload SONIC and PSAMPLE modules
Aug 19 00:32:44.212282 r-leopard-41 INFO sflow#hsflowd: drop-monitor support for SONiC
Aug 19 00:33:40.446770 r-leopard-41 INFO sflow#hsflowd: dropmon state INIT -> GET_FAMILY
Aug 19 00:33:40.446770 r-leopard-41 INFO sflow#hsflowd: dropmon state GET_FAMILY -> GOT_GROUP
Aug 19 00:33:41.037737 r-leopard-41 INFO sflow#hsflowd: dropmon state GOT_GROUP -> JOIN_GROUP
Aug 19 00:33:42.042191 r-leopard-41 INFO sflow#hsflowd: dropmon state JOIN_GROUP -> CONFIGURE
Aug 19 00:33:42.042191 r-leopard-41 INFO sflow#hsflowd: Configuring DropMon Failed, Module is already in Monitoring State, Continue...
Aug 19 00:33:42.042191 r-leopard-41 INFO sflow#hsflowd: message repeated 2 times: [ Configuring DropMon Failed, Module is already in Monitoring State, Continue...]
Aug 19 00:33:42.042191 r-leopard-41 INFO sflow#hsflowd: dropmon state CONFIGURE -> START
Aug 19 00:34:42.042191 r-leopard-41 INFO sflow#hsflowd: dropmon state START -> RUN
Aug 19 00:35:07.746616 r-leopard-41 INFO sflow#hsflowd: Received SIGTERM
Aug 19 00:35:07.445474 r-leopard-41 INFO sflow#hsflowd: dropmon state RUN -> STOP
Aug 19 00:35:07.795261 r-leopard-41 INFO sflow#hsflowd: stopped
  2. Test the sw=on case.
Aug 19 00:55:29.802990 r-leopard-41 INFO sflow#hsflowd: started
Aug 19 00:55:29.802990 r-leopard-41 INFO sflow#hsflowd: autoload SONIC and PSAMPLE modules
Aug 19 00:55:29.802990 r-leopard-41 INFO sflow#hsflowd: drop-monitor support for SONiC
Aug 19 00:55:38.213225 r-leopard-41 INFO sflow#hsflowd: dropmon state INIT -> GET_FAMILY
Aug 19 00:55:38.213225 r-leopard-41 INFO sflow#hsflowd: dropmon state GET_FAMILY -> GOT_GROUP
Aug 19 00:55:38.804221 r-leopard-41 INFO sflow#hsflowd: dropmon state GOT_GROUP -> JOIN_GROUP
Aug 19 00:55:39.808749 r-leopard-41 INFO sflow#hsflowd: dropmon state JOIN_GROUP -> CONFIGURE
Aug 19 00:55:39.808749 r-leopard-41 INFO sflow#hsflowd: Configuring DropMon Failed, Module is already in Monitoring State, Continue...
Aug 19 00:55:40.813554 r-leopard-41 INFO sflow#hsflowd: message repeated 2 times: [ Configuring DropMon Failed, Module is already in Monitoring State, Continue...]
Aug 19 00:55:40.813554 r-leopard-41 INFO sflow#hsflowd: dropmon state CONFIGURE -> START
Aug 19 00:55:42.430937 r-leopard-41 INFO sflow#hsflowd: dropmon state START -> RUN
Aug 19 00:56:36.441267 r-leopard-41 INFO sflow#hsflowd: Received SIGTERM
Aug 19 00:56:36.445474 r-leopard-41 INFO sflow#hsflowd: dropmon: graceful shutdown: turning off feed
Aug 19 00:56:36.445474 r-leopard-41 INFO sflow#hsflowd: dropmon state RUN -> STOP
Aug 19 00:56:36.507932 r-leopard-41 INFO sflow#hsflowd: stopped
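
One way to compare the two captures above without reading them line by line is to extract the state transitions and check for the feed-off message. A minimal sketch: the helper name `summarize_dropmon_log` is hypothetical, and the matched strings are copied from the log output above, so they would need adjusting if the log format differs.

```python
import re

# Strings taken from the syslog excerpts above.
FEED_OFF_MESSAGE = "dropmon: graceful shutdown: turning off feed"
TRANSITION_RE = re.compile(r"dropmon state (\w+) -> (\w+)")


def summarize_dropmon_log(log_text):
    """Return the dropmon state transitions and whether the feed was turned off."""
    return {
        "transitions": TRANSITION_RE.findall(log_text),
        "feed_turned_off": FEED_OFF_MESSAGE in log_text,
    }


if __name__ == "__main__":
    import sys

    # Usage: pipe a syslog excerpt on stdin.
    summary = summarize_dropmon_log(sys.stdin.read())
    for old, new in summary["transitions"]:
        print(f"{old} -> {new}")
    print("feed turned off on exit:", summary["feed_turned_off"])
```

Going by the captures above, the feed-off message appears only in the sw=on run, so the helper would report `feed_turned_off` as false for the sw=off run and true for the sw=on run.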

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@vivekrnv vivekrnv marked this pull request as draft August 21, 2023 20:11
@vivekrnv vivekrnv marked this pull request as ready for review August 22, 2023 01:59
@vivekrnv vivekrnv marked this pull request as draft August 31, 2023 18:07
@vivekrnv vivekrnv closed this Sep 13, 2023
vivekrnv pushed a commit that referenced this pull request Oct 13, 2023
…e latest HEAD automatically (sonic-net#15016)

src/wpasupplicant/sonic-wpa-supplicant

* a24412c25 - (HEAD -> 202205, origin/master, origin/HEAD, origin/202211, origin/202205, master) [mka]: Fix unexpected cleanup (#73) (8 days ago) [Ze Gan]
* 26d1da0bc - [mka]: Fix re-establishment by reset MI (#72) (8 days ago) [Ze Gan]
* f07e0a097 - [azp]: Update build pipeline to build for Bullseye (#70) (4 weeks ago) [Ze Gan]
*   2c69e2cda - Use github code scanning instead of LGTM (#69) (6 months ago) [Liu Shilong]
|\  
| * 23abb04e5 - fix (6 months ago) [shilongliu]
| * f34d68fe6 - libdbus-1-dev (6 months ago) [shilongliu]
| * dc2dd881e - add dbus (6 months ago) [shilongliu]
| * 5de037661 - use swsscommon packages (6 months ago) [shilongliu]
| * 32c5a2729 - Use github code scanning instead of LGTM (6 months ago) [shilongliu]
|/  
* aa731b96f - [azp]: Install libyang in azure pipeline (#68) (8 months ago) [Hua Liu]
* 71b635d74 - Revert "[Azp]: Upgrade Azp to bullseye (#49)" (#66) (9 months ago) [Ze Gan]
* 7aa4e6fa4 - Adding Microsoft SECURITY.MD (#58) (9 months ago) [microsoft-github-policy-service[bot]]