[T2] Continuous neighorch INFO logs emitted in orchagent #20214

Open
arista-nwolfe opened this issue Sep 10, 2024 · 3 comments
Labels
Chassis 🤖 Modular chassis support Triaged this issue has been triaged

Comments

@arista-nwolfe
Contributor

arista-nwolfe commented Sep 10, 2024

Seen on 202405.

If we turn on INFO log level in orchagent:
swssloglevel -n 0 -l INFO -c orchagent
swssloglevel -n 1 -l INFO -c orchagent

We see these logs continuously emitted:

2024 Sep  4 16:43:06.166952 cmp214-5 INFO swss1#orchagent: :- doLagMemberTask: Failed to locate LAG cmp214-5|asic1|PortChannel106
2024 Sep  4 16:43:06.166952 cmp214-5 INFO swss1#orchagent: :- doTask: Port eth1 doesn't exist
2024 Sep  4 16:43:06.166968 cmp214-5 INFO swss1#orchagent: :- doVoqSystemNeighTask: Port cmp214-5|asic1|Ethernet-IB1 doesn't exist
2024 Sep  4 16:43:06.166996 cmp214-5 INFO swss1#orchagent: :- doVoqSystemNeighTask: Port cmp214-5|asic1|Ethernet-IB1 doesn't exist
2024 Sep  4 16:43:06.166996 cmp214-5 INFO swss1#orchagent: :- doVoqSystemNeighTask: Port cmp214-5|asic1|Ethernet224 doesn't exist
2024 Sep  4 16:43:06.167009 cmp214-5 INFO swss1#orchagent: :- doVoqSystemNeighTask: Port cmp214-5|asic1|Ethernet224 doesn't exist
2024 Sep  4 16:43:06.167022 cmp214-5 INFO swss1#orchagent: :- doVoqSystemNeighTask: Port cmp214-5|asic1|PortChannel106 doesn't exist
2024 Sep  4 16:43:06.217314 cmp214-5 INFO swss1#orchagent: :- doVoqSystemNeighTask: Port cmp214-5|asic1|PortChannel106 doesn't exist
2024 Sep  4 16:43:06.217354 cmp214-5 INFO swss1#orchagent: :- doLagMemberTask: Failed to locate LAG cmp214-5|asic1|PortChannel106
2024 Sep  4 16:43:06.217386 cmp214-5 INFO swss1#orchagent: :- doLagMemberTask: Failed to locate LAG cmp214-5|asic1|PortChannel106

neighorch.cpp checks that the given port exists in portsorch:

            if (!gPortsOrch->getPort(alias, p))
            {
                SWSS_LOG_INFO("Port %s doesn't exist", alias.c_str());
                it++;
                continue;
            }

portsorch->getPort just checks that the given port name exists in m_portList:

bool PortsOrch::getPort(string alias, Port &p)
{
    SWSS_LOG_ENTER();

    if (m_portList.find(alias) == m_portList.end())
    {
        return false;
    }
    else
    {
        p = m_portList[alias];
        return true;
    }
}

I added tracing to portsorch at every place where m_portList is modified and saw that ports local to the orchagent instance are not added with the <LC>|<asic>| prefix in their names.
For example, consider cmp214-5|asic0|Ethernet72.
This port belongs to swss0, where it is only ever added as Ethernet72:

2024 Sep  5 00:32:59.960825 cmp214-5 ERR swss0#orchagent: :- setPort: NATHAN6 adding alias: Ethernet72
2024 Sep  5 00:32:59.999234 cmp214-5 ERR swss0#orchagent: :- setPort: NATHAN6 adding alias: Ethernet72
2024 Sep  5 00:33:00.029978 cmp214-5 ERR swss0#orchagent: :- setPort: NATHAN6 adding alias: Ethernet72
2024 Sep  5 00:33:00.147242 cmp214-5 ERR swss0#orchagent: :- setPort: NATHAN6 adding alias: Ethernet72

A remote swss instance (swss1), by contrast, sees it added as cmp214-5|asic0|Ethernet72:

2024 Sep  5 00:10:25.100809 cmp214-5 ERR swss1#orchagent: :- setPort: NATHAN6 adding alias: cmp214-5|asic0|Ethernet72
2024 Sep  5 00:10:26.460190 cmp214-5 ERR swss1#orchagent: :- setPort: NATHAN6 adding alias: cmp214-5|asic0|Ethernet72
2024 Sep  5 00:32:53.094595 cmp214-5 ERR swss1#orchagent: :- setPort: NATHAN6 adding alias: cmp214-5|asic0|Ethernet72
2024 Sep  5 00:33:00.682463 cmp214-5 ERR swss1#orchagent: :- setPort: NATHAN6 adding alias: cmp214-5|asic0|Ethernet72

This is why only swss0 emits this log for that port:

2024 Sep  5 18:25:40.335690 cmp214-5 INFO swss0#orchagent: :- doVoqSystemNeighTask: Port cmp214-5|asic0|Ethernet72 doesn't exist
2024 Sep  5 18:25:40.335690 cmp214-5 INFO swss0#orchagent: :- doVoqSystemNeighTask: Port cmp214-5|asic0|Ethernet72 doesn't exist

So the fix is one of the following:
- We should add the system port name, with the <LC>|<asic>| prefix, to m_portList in portsorch for local ports
OR
- We should strip the <LC>|<asic>| prefix from the system port name before querying m_portList in portsorch for local ports.
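
To illustrate the second option, here is a minimal, self-contained sketch of stripping the prefix before the lookup. It is not the actual portsorch code: the `Port` struct, the `toLocalAlias` helper, and the way the local linecard and ASIC names are passed in are all hypothetical, and the map stands in for m_portList. It also assumes that remote ports are stored under their full prefixed name, as the swss1 tracing above suggests.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the real Port class; only the alias matters here.
struct Port
{
    std::string alias;
};

// Hypothetical helper: if systemPortName starts with "<localLc>|<localAsic>|",
// return the remainder (the local port name); otherwise return it unchanged.
std::string toLocalAlias(const std::string &systemPortName,
                         const std::string &localLc,
                         const std::string &localAsic)
{
    const std::string prefix = localLc + "|" + localAsic + "|";
    if (systemPortName.compare(0, prefix.size(), prefix) == 0)
    {
        return systemPortName.substr(prefix.size());
    }
    return systemPortName;
}

// Same lookup logic as the getPort excerpt, modelled on a plain std::map.
bool getPort(const std::map<std::string, Port> &portList,
             const std::string &alias, Port &p)
{
    auto it = portList.find(alias);
    if (it == portList.end())
    {
        return false;
    }
    p = it->second;
    return true;
}

// Models the port list of swss0 on cmp214-5: the local port is stored without
// a prefix, the remote port (assumed) with its full system port name.
void checkExample()
{
    std::map<std::string, Port> portList;
    portList["Ethernet72"] = Port{"Ethernet72"};
    portList["cmp214-5|asic1|Ethernet224"] = Port{"cmp214-5|asic1|Ethernet224"};

    Port p;
    const std::string local = "cmp214-5|asic0|Ethernet72";
    const std::string remote = "cmp214-5|asic1|Ethernet224";

    // Querying with the full system port name misses the local port.
    assert(!getPort(portList, local, p));

    // Stripping the local prefix first makes the lookup succeed.
    assert(getPort(portList, toLocalAlias(local, "cmp214-5", "asic0"), p));
    assert(p.alias == "Ethernet72");

    // Remote names do not match the local prefix and are left untouched.
    assert(getPort(portList, toLocalAlias(remote, "cmp214-5", "asic0"), p));
}
```

Calling `checkExample()` exercises both cases. The first option would instead insert the local port under both names (or only the prefixed one) when it is added to the list, which avoids translating on every lookup but changes what other consumers of the list see.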

@kenneth-arista
Contributor

@arlakshm @wenyiz2021

@arista-nwolfe
Contributor Author

@arlakshm mentioned that these logs are also seen on 202205 and are likely benign.
I'm not closing this bug, though: we should still track these logs, because the excess output makes it harder to debug with more verbose logging, as our process gets rate-limited sooner.

@arista-nwolfe arista-nwolfe changed the title [T2][202405] Continuous neighorch INFO logs emitted in orchagent [T2] Continuous neighorch INFO logs emitted in orchagent Sep 10, 2024
@zhangyanzhao zhangyanzhao added the Chassis 🤖 Modular chassis support label Sep 11, 2024
@zhangyanzhao
Collaborator

MSFT SONiC team will take a look.

@zhangyanzhao zhangyanzhao added the Triaged this issue has been triaged label Sep 11, 2024