[Celestica] PMon error #5853

Closed
lguohan opened this issue Nov 7, 2020 · 6 comments · Fixed by #5896

@lguohan
Collaborator

lguohan commented Nov 7, 2020

Description

Steps to reproduce the issue:
1.
2.
3.

Describe the results you received:

syslog.80.gz:Nov  7 11:47:08.067304 str2-dx010-acs-6 INFO pmon#supervisord: xcvrd Process Process-1:
syslog.80.gz:Nov  7 11:47:08.067304 str2-dx010-acs-6 INFO pmon#supervisord: xcvrd Traceback (most recent call last):
syslog.80.gz:Nov  7 11:47:08.067304 str2-dx010-acs-6 INFO pmon#supervisord: xcvrd   File "/usr/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
syslog.80.gz:Nov  7 11:47:08.067304 str2-dx010-acs-6 INFO pmon#supervisord: xcvrd     self.run()
syslog.80.gz:Nov  7 11:47:08.067304 str2-dx010-acs-6 INFO pmon#supervisord: xcvrd   File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
syslog.80.gz:Nov  7 11:47:08.067304 str2-dx010-acs-6 INFO pmon#supervisord: xcvrd     self._target(*self._args, **self._kwargs)
syslog.80.gz:Nov  7 11:47:08.067371 str2-dx010-acs-6 INFO pmon#supervisord: xcvrd   File "/usr/local/bin/xcvrd", line 963, in task_worker
syslog.80.gz:Nov  7 11:47:08.067371 str2-dx010-acs-6 INFO pmon#supervisord: xcvrd     status, port_dict = _wrapper_get_transceiver_change_event(timeout)
syslog.80.gz:Nov  7 11:47:08.067371 str2-dx010-acs-6 INFO pmon#supervisord: xcvrd   File "/usr/local/bin/xcvrd", line 184, in _wrapper_get_transceiver_change_event
syslog.80.gz:Nov  7 11:47:08.067371 str2-dx010-acs-6 INFO pmon#supervisord: xcvrd     return platform_sfputil.get_transceiver_change_event(timeout)
syslog.80.gz:Nov  7 11:47:08.067371 str2-dx010-acs-6 INFO pmon#supervisord: xcvrd AttributeError: 'SfpUtilHelper' object has no attribute 'get_transceiver_change_event'
syslog.80.gz:Nov  7 11:47:09.283457 str2-dx010-acs-6 WARNING pmon#thermalctld: Fan fault warning: PSU-1 FAN-1 is broken.
syslog.80.gz:Nov  7 11:47:09.340973 str2-dx010-acs-6 WARNING pmon#thermalctld: Insufficient number of working fans warning: 1 fans are not working.

Describe the results you expected:

Additional information you deem important (e.g. issue happens only occasionally):

**Output of `show version`:**

```
(paste your output here)
```

**Attach debug file `sudo generate_dump`:**

```
(paste your output here)
```
lguohan added the "Master Branch Quality" and "P0" labels Nov 7, 2020
@jleveque
Contributor

jleveque commented Nov 7, 2020

Root cause is that there is no get_change_event() method defined for the dx010 platform in the platform's Chassis class, so xcvrd falls back to try the old sfputil plugin, and get_transceiver_change_event() is not implemented there, either.
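To illustrate the fallback described above, here is a minimal sketch. It is not the actual xcvrd code: `ChassisStub`, `SfpUtilHelperStub`, and `wait_for_transceiver_event` are hypothetical names, and the sketch assumes that a missing `get_change_event()` surfaces as `NotImplementedError` from the chassis object.

```python
class ChassisStub:
    """Hypothetical chassis whose get_change_event() is not implemented."""

    def get_change_event(self, timeout=0):
        raise NotImplementedError


class SfpUtilHelperStub:
    """Hypothetical legacy plugin with no get_transceiver_change_event()."""


def wait_for_transceiver_event(chassis, sfputil, timeout=0):
    """Try the chassis API first, then fall back to the legacy plugin."""
    try:
        status, events = chassis.get_change_event(timeout)
        return status, events.get("sfp", {})
    except NotImplementedError:
        pass  # chassis does not support change events; try the old plugin
    # If the plugin lacks the method too, this raises AttributeError,
    # which is the failure mode shown in the syslog traceback.
    return sfputil.get_transceiver_change_event(timeout)


try:
    wait_for_transceiver_event(ChassisStub(), SfpUtilHelperStub())
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

With both stubs lacking a working implementation, the call ends in an `AttributeError` naming `get_transceiver_change_event`, matching the traceback in the issue description. Implementing either method on the platform side would avoid the crash.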

@bingwang-ms
Contributor

pmon service will not start on SONiC.master.478-23b0e07d, dx010-4. Is it the same issue?

@jleveque
Contributor

@bingwang-ms: I just checked that device, and yes, this issue is present. However, there also appears to be a driver or hardware issue with that device, as I see errors like the following:

thermalctld IOError: Failed to read eeprom : [Errno 2] No such file or directory: '/sys/class/i2c-adapter/i2c-12/12-0050/eeprom'

You might want to try rebooting and/or power-cycling the device.

jleveque changed the title from "pmon error" to "[Celestica] PMon error" Nov 10, 2020
@bingwang-ms
Contributor

Thanks @jleveque. I already rebooted the DUT, but the issue is still present. The file '/sys/class/i2c-adapter/i2c-12/12-0050/eeprom' is missing on the DUT. Is this file included in the image?

@jleveque
Contributor

jleveque commented Nov 10, 2020

@bingwang-ms: No. This is a device file. It should be created by the platform drivers. Either the drivers are not working properly or there is a hardware issue. I would check another device of the same platform running the same version of SONiC to see if the issue exists there, also.
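For comparing two devices, a quick check like the following could help. It is only a sketch: `missing_device_files` is a hypothetical helper, and the path is the one from the thermalctld error quoted earlier in this thread.

```python
import os


def missing_device_files(paths):
    """Return the subset of paths that do not exist on this system."""
    return [p for p in paths if not os.path.exists(p)]


# Path taken from the thermalctld error in this thread; add other
# device files expected on the platform as needed.
EEPROM_PATH = "/sys/class/i2c-adapter/i2c-12/12-0050/eeprom"

if __name__ == "__main__":
    for path in missing_device_files([EEPROM_PATH]):
        print("missing:", path)
```

If the file is missing on one device but present on another running the same image, that points toward hardware rather than the drivers.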

@lguohan
Collaborator Author

lguohan commented Nov 11, 2020

Celestica will upload the change by Friday.
