[show] Add 'show' CLI for system-health feature #971

Merged
merged 25 commits into from
Oct 12, 2020
d58195e
Add 'show' CLI for system-health feature
Jun 15, 2020
61d90ac
Add unit test for 'system-health' feature, add support for testing in…
Jun 30, 2020
0fc2d48
Merge branch 'master' of https://github.com/Azure/sonic-utilities int…
Jul 8, 2020
04d0672
Fix additional comments
Jul 13, 2020
c04dfed
Fix comments
Jul 14, 2020
d33454c
Update Command-Reference.md
shlomibitton Jul 15, 2020
28f9625
Fix LGTM alerts
Jul 15, 2020
393c3de
Fix comment
shlomibitton Jul 16, 2020
99aa1d5
Update Command-Reference.md
shlomibitton Jul 19, 2020
bfe5b2e
Update Command-Reference.md
shlomibitton Jul 26, 2020
9138ab4
Change 'summary' output and adapt test and reference to the new change
Jul 28, 2020
bd7c529
Update main.py
shlomibitton Jul 28, 2020
2d90fc3
Fix multiline output for expected output
Aug 2, 2020
112147f
keep output aligned
Aug 5, 2020
0e59ae4
Fix import for unit testing after community change
Aug 12, 2020
713051e
Merge branch 'master' of https://github.com/Azure/sonic-utilities int…
Aug 12, 2020
f65320e
Add clicommon for @cli.group after community change
Aug 12, 2020
d9be51f
Merge branch 'master' into shlomi_system_health_cli
shlomibitton Sep 7, 2020
eb6409f
Align changes in the feature to the CLI on commit
Sep 9, 2020
e0794ae
Update main.py
shlomibitton Sep 9, 2020
872030f
Move new group CLI into a separate file
Sep 10, 2020
bd57109
Merge branch 'master' into shlomi_system_health_cli
shlomibitton Sep 10, 2020
ee85d43
Organize imports per PEP8 standards
shlomibitton Sep 13, 2020
5be5fd6
Organize imports per PEP8 standards
shlomibitton Sep 14, 2020
5b1c981
Reformat docstring for readability
Sep 16, 2020
185 changes: 185 additions & 0 deletions doc/Command-Reference.md
@@ -108,6 +108,7 @@
* [System State](#system-state)
* [Processes](#processes)
* [Services & Memory](#services--memory)
* [System-Health](#system-health)
* [VLAN & FDB](#vlan--fdb)
* [VLAN](#vlan)
* [VLAN show commands](#vlan-show-commands)
@@ -5938,6 +5939,190 @@ NOTE: This command is not working. It crashes as follows. A bug ticket is opened

Go Back To [Beginning of the document](#) or [Beginning of this section](#System-State)

### System-Health

These commands monitor the system's currently running services and hardware state.

**show system-health summary**

This command displays the current status of the monitored 'Services' and 'Hardware'.
If any element in either section is 'Not OK', a message describing the problem appears under that section.

- Usage:
```
show system-health summary
```

- Example:
```
admin@sonic:~$ show system-health summary
System status summary

System status LED red
Services:
Status: Not OK
Not Running: 'telemetry', 'sflowmgrd'
Hardware:
Status: OK
```
```
admin@sonic:~$ show system-health summary
System status summary

System status LED green
Services:
Status: OK
Hardware:
Status: OK
```
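
For scripted checks, the summary output can be reduced to a small dictionary. The following is a minimal, unofficial sketch in Python: `parse_summary` and `SAMPLE` are hypothetical names, and the parser assumes the line layout shown in the example above rather than any guaranteed output format.

```python
def parse_summary(text):
    """Parse 'show system-health summary' output into a dict.

    Assumption: the layout matches the documented example; this is not
    an official parser.
    """
    result = {"led": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "System status summary":
            continue
        if line.startswith("System status LED"):
            # The LED color is the last word on the line.
            result["led"] = line.split()[-1]
        elif line.endswith(":"):
            # A bare "Services:" or "Hardware:" line opens a new section.
            current = line[:-1]
            result["sections"][current] = {}
        elif ":" in line and current is not None:
            key, _, value = line.partition(":")
            result["sections"][current][key.strip()] = value.strip()
    return result


# Sample text copied from the first example above.
SAMPLE = """\
System status summary

System status LED red
Services:
Status: Not OK
Not Running: 'telemetry', 'sflowmgrd'
Hardware:
Status: OK
"""

summary = parse_summary(SAMPLE)
unhealthy = [name for name, fields in summary["sections"].items()
             if fields.get("Status") != "OK"]
print(summary["led"], unhealthy)
```

On a device, the text passed to `parse_summary` would be the captured output of `show system-health summary` instead of the embedded sample.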

**show system-health monitor-list**

This command lists every 'Services' and 'Hardware' element currently being monitored, along with its status and type.

- Usage:
```
show system-health monitor-list
```

- Example:
```
admin@sonic:~$ show system-health monitor-list
System services and devices monitor list

Name            Status    Type
--------------  --------  ----------
telemetry       Not OK    Process
orchagent       Not OK    Process
neighsyncd      OK        Process
vrfmgrd         OK        Process
dialout_client  OK        Process
zebra           OK        Process
rsyslog         OK        Process
snmpd           OK        Process
redis_server    OK        Process
intfmgrd        OK        Process
vxlanmgrd       OK        Process
lldpd_monitor   OK        Process
portsyncd       OK        Process
var-log         OK        Filesystem
lldpmgrd        OK        Process
syncd           OK        Process
sonic           OK        System
buffermgrd      OK        Process
portmgrd        OK        Process
staticd         OK        Process
bgpd            OK        Process
lldp_syncd      OK        Process
bgpcfgd         OK        Process
snmp_subagent   OK        Process
root-overlay    OK        Filesystem
fpmsyncd        OK        Process
sflowmgrd       OK        Process
vlanmgrd        OK        Process
nbrmgrd         OK        Process
PSU 2           OK        PSU
psu_1_fan_1     OK        Fan
psu_2_fan_1     OK        Fan
fan11           OK        Fan
fan10           OK        Fan
fan12           OK        Fan
ASIC            OK        ASIC
fan1            OK        Fan
PSU 1           OK        PSU
fan3            OK        Fan
fan2            OK        Fan
fan5            OK        Fan
fan4            OK        Fan
fan7            OK        Fan
fan6            OK        Fan
fan9            OK        Fan
fan8            OK        Fan
```
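
Names such as `PSU 2` and statuses such as `Not OK` contain spaces, so splitting each row on whitespace is unreliable. One possible approach, sketched below in Python, is to slice every row at the column positions given by the dashed ruler line. This assumes the table is printed with aligned columns; `parse_table` and `SAMPLE` are hypothetical names, and the sample is a shortened copy of the example output.

```python
import re


def parse_table(text):
    """Parse an aligned table by slicing rows at the dashed ruler's columns."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # The ruler is the first line made only of dashes and spaces.
    ruler_idx = next(i for i, ln in enumerate(lines)
                     if set(ln.strip()) <= {"-", " "})
    spans = [m.span() for m in re.finditer(r"-+", lines[ruler_idx])]
    headers = [lines[ruler_idx - 1][s:e].strip() for s, e in spans]
    rows = []
    for ln in lines[ruler_idx + 1:]:
        cells = [ln[s:e].strip() for s, e in spans[:-1]]
        # Take the last column to the end of the line in case it overruns.
        cells.append(ln[spans[-1][0]:].strip())
        rows.append(dict(zip(headers, cells)))
    return rows


# Shortened sample; column alignment is an assumption about the real output.
SAMPLE = """\
Name            Status    Type
--------------  --------  ----------
telemetry       Not OK    Process
orchagent       Not OK    Process
dialout_client  OK        Process
var-log         OK        Filesystem
PSU 2           OK        PSU
"""

rows = parse_table(SAMPLE)
not_ok = [row["Name"] for row in rows if row["Status"] != "OK"]
print(not_ok)
```

Slicing by the ruler keeps multi-word cells intact without depending on a particular separator width.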

**show system-health detail**

This command displays the current status of the monitored 'Services' and 'Hardware'.
If any element in either section is 'Not OK', a message describing the problem appears under that section.
It also lists every 'Services' and 'Hardware' element being monitored, followed by the elements that are ignored.

- Usage:
```
show system-health detail
```

- Example:
```
admin@sonic:~$ show system-health detail
System status summary

System status LED red
Services:
Status: Not OK
Not Running: 'telemetry', 'orchagent'
Hardware:
Status: OK

System services and devices monitor list

Name            Status    Type
--------------  --------  ----------
telemetry       Not OK    Process
orchagent       Not OK    Process
neighsyncd      OK        Process
vrfmgrd         OK        Process
dialout_client  OK        Process
zebra           OK        Process
rsyslog         OK        Process
snmpd           OK        Process
redis_server    OK        Process
intfmgrd        OK        Process
vxlanmgrd       OK        Process
lldpd_monitor   OK        Process
portsyncd       OK        Process
var-log         OK        Filesystem
lldpmgrd        OK        Process
syncd           OK        Process
sonic           OK        System
buffermgrd      OK        Process
portmgrd        OK        Process
staticd         OK        Process
bgpd            OK        Process
lldp_syncd      OK        Process
bgpcfgd         OK        Process
snmp_subagent   OK        Process
root-overlay    OK        Filesystem
fpmsyncd        OK        Process
sflowmgrd       OK        Process
vlanmgrd        OK        Process
nbrmgrd         OK        Process
PSU 2           OK        PSU
psu_1_fan_1     OK        Fan
psu_2_fan_1     OK        Fan
fan11           OK        Fan
fan10           OK        Fan
fan12           OK        Fan
ASIC            OK        ASIC
fan1            OK        Fan
PSU 1           OK        PSU
fan3            OK        Fan
fan2            OK        Fan
fan5            OK        Fan
fan4            OK        Fan
fan7            OK        Fan
fan6            OK        Fan
fan9            OK        Fan
fan8            OK        Fan

System services and devices ignore list

Name         Status    Type
-----------  --------  ------
psu.voltage  Ignored   Device
```
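
Because the detail output concatenates three blocks, a script will usually want to separate them before further processing. The Python sketch below groups the lines under the three section titles that appear in the example above. `split_sections` and `SAMPLE` are hypothetical names, the sample is shortened, and the assumption is that each title appears on a line of its own.

```python
SECTION_TITLES = (
    "System status summary",
    "System services and devices monitor list",
    "System services and devices ignore list",
)


def split_sections(text):
    """Group non-empty lines of the detail output under their section title."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in SECTION_TITLES:
            current = stripped
            sections[current] = []
        elif stripped and current is not None:
            sections[current].append(line.rstrip())
    return sections


# Shortened sample based on the example output above.
SAMPLE = """\
System status summary

System status LED red
Services:
Status: Not OK
Not Running: 'telemetry', 'orchagent'
Hardware:
Status: OK

System services and devices monitor list

Name            Status    Type
--------------  --------  ----------
telemetry       Not OK    Process
orchagent       Not OK    Process

System services and devices ignore list

Name         Status    Type
-----------  --------  ------
psu.voltage  Ignored   Device
"""

sections = split_sections(SAMPLE)
for title, body in sections.items():
    print(title, len(body))
```
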
Go Back To [Beginning of the document](#) or [Beginning of this section](#System-Health)

## VLAN & FDB

5 changes: 4 additions & 1 deletion show/main.py
@@ -16,12 +16,15 @@
import mlnx
import utilities_common.cli as clicommon
import vlan
import system_health

from sonic_py_common import device_info
from swsssdk import ConfigDBConnector, SonicV2Connector
from tabulate import tabulate
from utilities_common.db import Db
import utilities_common.multi_asic as multi_asic_util


# Global Variables
PLATFORM_JSON = 'platform.json'
HWSKU_JSON = 'hwsku.json'
@@ -125,6 +128,7 @@ def cli(ctx):
cli.add_command(interfaces.interfaces)
cli.add_command(kube.kubernetes)
cli.add_command(vlan.vlan)
cli.add_command(system_health.system_health)

#
# 'vrf' command ("show vrf")
@@ -2429,6 +2433,5 @@ def tunnel():

    click.echo(tabulate(table, header))


if __name__ == '__main__':
    cli()
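
The `cli.add_command(system_health.system_health)` registration follows click's usual pattern of defining a command group separately and attaching it to a parent group. A minimal single-file sketch of that pattern is given below. The subcommand bodies are placeholders rather than the code from this PR's `system_health` module, and `cli` here is a stand-in for the real `show` entry point.

```python
import click
from click.testing import CliRunner


@click.group()
def cli():
    """Stand-in for the top-level 'show' command group."""


@click.group(name="system-health")
def system_health():
    """Show system-health information."""


@system_health.command()
def summary():
    """Placeholder: the real command reports service and hardware status."""
    click.echo("System status summary")


@system_health.command(name="monitor-list")
def monitor_list():
    """Placeholder: the real command prints the monitored-elements table."""
    click.echo("System services and devices monitor list")


# Same registration pattern as in show/main.py.
cli.add_command(system_health)

# Invoke the nested command in-process, without a shell.
result = CliRunner().invoke(cli, ["system-health", "monitor-list"])
print(result.output)
```

Keeping the group in its own module leaves `main.py` with only an import and one `add_command` call, which matches the shape of the diff above.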