[chassis] phy-credo errors seen on linecard on syncd shutdown #40

Closed
arlakshm opened this issue May 6, 2022 · 3 comments
Comments

@arlakshm

arlakshm commented May 6, 2022

The following errors are seen on a 100G linecard when syncd is shut down:

May  6 20:55:58.273862 str2-7804-lc7-1 INFO systemd[1]: syncd.service: Succeeded.
May  6 20:55:58.274245 str2-7804-lc7-1 INFO systemd[1]: Stopped syncd service.
May  6 20:55:58.277717 str2-7804-lc7-1 INFO systemd[1]: swss.service: Succeeded.
May  6 20:55:58.278080 str2-7804-lc7-1 INFO systemd[1]: Stopped switch state service.
May  6 20:55:58.278345 str2-7804-lc7-1 INFO systemd[1]: interfaces-config.service: Succeeded.
May  6 20:55:58.278839 str2-7804-lc7-1 INFO systemd[1]: Stopped Update interfaces configuration.
May  6 20:55:58.278921 str2-7804-lc7-1 INFO systemd[1]: Stopping Update interfaces configuration...
May  6 20:56:28.406783 str2-7804-lc7-1 INFO phy-credo.py[2658]: Traceback (most recent call last):
May  6 20:56:28.406946 str2-7804-lc7-1 INFO phy-credo.py[2658]:   File "/usr/bin/phy-credo.py", line 261, in <module>
May  6 20:56:28.407042 str2-7804-lc7-1 INFO phy-credo.py[2658]:     sys.exit(main())
May  6 20:56:28.407096 str2-7804-lc7-1 INFO phy-credo.py[2658]:   File "/usr/bin/phy-credo.py", line 254, in main
May  6 20:56:28.407142 str2-7804-lc7-1 INFO phy-credo.py[2658]:     while phyd.run():
May  6 20:56:28.407198 str2-7804-lc7-1 INFO phy-credo.py[2658]:   File "/usr/bin/phy-credo.py", line 225, in run
May  6 20:56:28.407248 str2-7804-lc7-1 INFO phy-credo.py[2658]:     intf2medium = self.get_xcvr_medium_map()
May  6 20:56:28.407295 str2-7804-lc7-1 INFO phy-credo.py[2658]:   File "/usr/bin/phy-credo.py", line 194, in get_xcvr_medium_map
May  6 20:56:28.407340 str2-7804-lc7-1 INFO phy-credo.py[2658]:     for key in self.db.keys(self.db.STATE_DB, 'TRANSCEIVER_INFO|*'):
May  6 20:56:28.407388 str2-7804-lc7-1 INFO phy-credo.py[2658]: TypeError: 'NoneType' object is not iterable
May  6 20:56:28.441247 str2-7804-lc7-1 NOTICE systemd[1]: phy-credo-daemon.service: Main process exited, code=exited, status=1/FAILURE
May  6 20:56:28.441386 str2-7804-lc7-1 WARNING systemd[1]: phy-credo-daemon.service: Failed with result 'exit-code'.
May  6 20:57:05.094161 str2-7804-lc7-1 WARNING systemd[1]: hostcfgd.service: State 'stop-sigterm' timed out. Killing.
May  6 20:57:05.094333 str2-7804-lc7-1 NOTICE systemd[1]: hostcfgd.service: Killing process 3984 (hostcfgd) with signal SIGKILL.
May  6 20:57:05.096526 str2-7804-lc7-1 WARNING systemd[1]: hostcfgd.service: Main process exited, code=killed, status=9/KILL
May  6 20:57:05.096626 str2-7804-lc7-1 WARNING systemd[1]: hostcfgd.service: Failed with result 'timeout'.
May  6 20:57:05.097620 str2-7804-lc7-1 INFO systemd[1]: Stopped Host config enforcer daemon.
May  6 20:57:05.099072 str2-7804-lc7-1 INFO systemd[1]: hostcfgd.timer: Succeeded.
May  6 20:57:05.099278 str2-7804-lc7-1 INFO systemd[1]: Stopped Delays hostcfgd daemon until SONiC has started.
May  6 20:57:05.099365 str2-7804-lc7-1 INFO systemd[1]: Stopping Delays hostcfgd daemon until SONiC has started.
May  6 20:57:05.099516 str2-7804-lc7-1 INFO systemd[1]: Started Delays hostcfgd daemon until SONiC has started.
May  6 20:57:05.099936 str2-7804-lc7-1 INFO systemd[1]: updategraph.service: Succeeded.
May  6 20:57:05.101063 str2-7804-lc7-1 INFO systemd[1]: Stopped Update minigraph and set configuration based on minigraph.
May  6 20:57:05.101182 str2-7804-lc7-1 INFO systemd[1]: Stopping Update minigraph and set configuration based on minigraph...
May  6 20:57:05.101420 str2-7804-lc7-1 INFO systemd[1]: config-setup.service: Succeeded.
May  6 20:57:05.102573 str2-7804-lc7-1 INFO systemd[1]: Stopped Config initialization and migration service.
May  6 20:57:05.102698 str2-7804-lc7-1 INFO systemd[1]: Stopping Config initialization and migration service...
@Staphylo
Member

Staphylo commented May 9, 2022

We will look into this.
Likely a case where phy-credo does not gracefully handle the database going down.
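
A minimal sketch of the kind of guard that would avoid the traceback, assuming the `TypeError` comes from `keys()` returning `None` once the database is unavailable or empty. `FakeDb` and `get_transceiver_keys` are hypothetical names used only for illustration; this is not the actual phy-credo code or fix.

```python
class FakeDb:
    """Hypothetical stand-in for the database connector used by phy-credo.py."""
    STATE_DB = 'STATE_DB'

    def __init__(self, keys):
        self._keys = keys

    def keys(self, db_name, pattern):
        # Assumption: the real connector can hand back None instead of a list,
        # which is what the TypeError in the traceback suggests.
        return self._keys


def get_transceiver_keys(db):
    """Return the TRANSCEIVER_INFO keys, or an empty list when the lookup yields None."""
    keys = db.keys(db.STATE_DB, 'TRANSCEIVER_INFO|*')
    return list(keys) if keys is not None else []


# Iterating over the result is safe even when the lookup returned None.
for key in get_transceiver_keys(FakeDb(None)):
    print(key)
```

With a guard like this the loop body is simply skipped during shutdown, so the daemon can decide separately whether to retry or exit cleanly.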

@byu343
Contributor

byu343 commented May 9, 2022

Hi @arlakshm,
Do you know which command triggered the shutdown: 'reboot', 'config reload', or something else? Thanks.

@arlakshm
Author

arlakshm commented May 9, 2022

I saw this on reboot.

arlakshm pushed a commit to sonic-net/sonic-buildimage that referenced this issue Jul 6, 2022
Add support for reacting to speed change between 40G and 100G in CONFIG_DB
Fix a bug on optical bit setting.
Avoid the random error in shutdown for issue: aristanetworks/sonic#40
Avoid to run on SmartsvilleBkMs, which depends on a different driver (credo-sai).

How I did it
How to verify it
Verified on the duts that the commands printed in the log are matching the expectation and the interfaces are up.
yxieca pushed a commit to sonic-net/sonic-buildimage that referenced this issue Aug 11, 2022
skbarista pushed a commit to skbarista/sonic-buildimage that referenced this issue Aug 17, 2022
…et#10990)

@Staphylo Staphylo closed this as completed Dec 7, 2022