Smartswitch Platform Test Plan Document #12701
We reviewed the test cases today (5/8/2024). One comment: please change to SONiC-DASH OS for the DPU, and SONiC only for the CPU/NPU :)
* The "show reboot-cause history module-name" CLI on the switch shows the history of the specified module
* Use `config chassis modules shutdown <DPU_Number>`
* Use `config chassis modules startup <DPU_Number>`
* Wait for 5 minutes for Pmon to update the dpu states
Why do we need to wait 5 minutes? What is the maximum time until the DPU state is updated?
- Considering the time needed to power on the DPU, bring up the services on the DPU, and update the chassis DB, we set 5 minutes as the maximum limit.
- This time limit applies to the initial boot-up case; for subsequent operations, state updates are expected to be instantaneous.
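Since 5 minutes is a ceiling rather than a required delay, the wait can be expressed as polling with a timeout. The following is a minimal sketch, not the test plan's implementation: `wait_for_state`, its parameters, and the caller-supplied `check` callable are hypothetical names, and the 10-second polling interval is an assumption.

```python
import time
from typing import Callable


def wait_for_state(
    check: Callable[[], bool],
    timeout_s: float = 300.0,   # the 5-minute upper bound from the discussion
    interval_s: float = 10.0,   # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Poll check() until it returns True or timeout_s has elapsed.

    Returns the elapsed time in seconds on success and raises
    TimeoutError if the state was not reached within timeout_s.
    """
    start = clock()
    while True:
        if check():
            return clock() - start
        if clock() - start >= timeout_s:
            raise TimeoutError(f"state not reached within {timeout_s} seconds")
        sleep(interval_s)
```

A test would pass a `check` that reads the DPU state on the switch. The initial boot pays at most the full timeout, while subsequent state changes return on the first poll.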
### 1.8 Check the NTP date and timezone between DPU and NPU

#### Steps
* In Switch, under the file /etc/ntp.conf configure it to use the ntp server and restart ntp.service to configure
NTP configuration should be set via config DB. An example of the configuration is in the NTP HLD: https://github.com/sonic-net/SONiC/blob/master/doc/ntp/ntp-design.md
You are right. This test case only checks that the dates on the NPU and DPU are in sync. It has nothing to do with any configuration.
Can you please update the steps? The first step here describes setting the configuration:
"under the file /etc/ntp.conf configure it to use the ntp server and restart ntp.service"
Changed it.
#### Steps
* In Switch, under the file /etc/ntp.conf configure it to use the ntp server and restart ntp.service to configure
* In DPU, similarly under the ntp configuration use the switches ip as ntp server and restart ntp service to configure
Does this mean that SONiC on the switch should run an NTP server? NTP server support is not yet integrated into SONiC. It should be possible to configure the NTP server via Linux config files, but this configuration might conflict with the NTP client configuration that SONiC supports.
If we want to run the NTP server on the switch we need to discuss this with the Microsoft team. @prgeor can you please assist?
This case has nothing to do with any configuration. It only checks that the date and time zone are the same on the host and the DPUs. Changed the test case accordingly.
Can you please update the steps? The steps say the opposite.
Changed it.
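For reference, the check described in this thread (same date and time zone on the switch and the DPUs, with no NTP configuration involved) could look roughly like the sketch below. It assumes each side runs `date +"%s %Z"` (epoch seconds plus time zone abbreviation) and that the test compares the two outputs. The helper names and the 5-second skew tolerance are assumptions, not values from the test plan.

```python
from typing import Tuple


def parse_date_output(text: str) -> Tuple[int, str]:
    """Parse the output of `date +"%s %Z"`, e.g. "1715162400 UTC"."""
    epoch, tz_name = text.split()
    return int(epoch), tz_name


def clocks_in_sync(npu_output: str, dpu_output: str, max_skew_s: int = 5) -> bool:
    """Return True if both sides report the same time zone and close clocks.

    max_skew_s is an assumed tolerance that absorbs the delay between
    running the command on the switch and on the DPU.
    """
    npu_epoch, npu_tz = parse_date_output(npu_output)
    dpu_epoch, dpu_tz = parse_date_output(dpu_output)
    return npu_tz == dpu_tz and abs(npu_epoch - dpu_epoch) <= max_skew_s
```

Comparing epoch seconds keeps the skew check independent of the time zone, so a time zone mismatch and a clock drift are detected separately.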
The default will be T1 roles. There is a plan to use them as T0 switches as well.
hi @prgeor - are you ok to merge this one?
lgtm
@nissampa SONiC does not have BMC support. Can you remove BMC from the PR description?
Removed it.
root@sonic:/home/cisco# show chassis modules status
  Name    Description    Physical-Slot    Oper-Status    Admin-Status    Serial
------  -------------  ---------------  -------------  --------------  --------
  DPU0            N/A               -1         Online              up       N/A
@nissampa Please update the output of the CLI
- Physical slot should be NA for DPU
- Serial Number should not be NA at least not when DPU is online
- Description should be "Data Processing Unit"
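The three expectations above can be turned into an automated check on the CLI output. The sketch below is illustrative only: it assumes the output keeps the column layout shown in the sample (a header row, a dashed separator, then one row per module), derives column boundaries from the dashed separator so that a multi-word description stays in one cell, and accepts either "NA" or "N/A" for the physical slot because the exact spelling is not settled in this thread. `parse_table` and `check_dpu_rows` are hypothetical helper names.

```python
import re
from typing import Dict, List


def parse_table(output: str) -> List[Dict[str, str]]:
    """Parse a header / dashed-separator / rows table into dictionaries."""
    lines = [ln.rstrip() for ln in output.splitlines() if ln.strip()]
    # The separator is the first line made up only of dashes and spaces.
    sep_idx = next(
        i for i, ln in enumerate(lines) if set(ln.replace(" ", "")) == {"-"}
    )
    header, separator = lines[sep_idx - 1], lines[sep_idx]
    # Each run of dashes marks the character span of one column.
    spans = [(m.start(), m.end()) for m in re.finditer(r"-+", separator)]
    names = [header[start:end].strip() for start, end in spans]
    return [
        {name: ln[start:end].strip() for name, (start, end) in zip(names, spans)}
        for ln in lines[sep_idx + 1:]
    ]


def check_dpu_rows(rows: List[Dict[str, str]]) -> List[str]:
    """Return a list of problems found in DPU rows; empty means all good."""
    problems = []
    for row in rows:
        name = row["Name"]
        if not name.startswith("DPU"):
            continue
        if row["Description"] != "Data Processing Unit":
            problems.append(f"{name}: unexpected description {row['Description']!r}")
        if row["Physical-Slot"] not in ("NA", "N/A"):
            problems.append(f"{name}: unexpected physical slot {row['Physical-Slot']!r}")
        if row["Oper-Status"] == "Online" and row["Serial"] in ("NA", "N/A"):
            problems.append(f"{name}: serial number missing while online")
    return problems
```

Returning a list of problems rather than a single boolean lets a test report every mismatch in one run.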
DPUX N/A -1 Online up N/A | ||
``` | ||
#### Pass/Fail Criteria
* Verify number of DPUs from api and number of DPUs shown in the cli output.
@nissampa the source of truth for the number of DPUs should ideally come from the Ansible inventory file for the testbed. We cannot rely on the APIs which are under test.
The inventory file should specify how many DPUs are expected in a testbed.
https://github.com/sonic-net/sonic-mgmt/blob/master/tests/common/utilities.py#L341 parses the inventory.
Example: system EEPROM info for a testbed is fetched from the inventory here: https://github.com/sonic-net/sonic-mgmt/blob/master/tests/platform_tests/api/test_chassis.py#L249
Changed the source of truth to the inventory file instead of the API.
root@sonic:/home/cisco#
``` | ||
#### Pass/Fail Criteria
@nissampa What is expected for DPUs that are offline/admin down?
The output will show 0 for DPUs that are offline/admin down.
@nissampa can you capture the output of this CLI on the DPU? This will set the expectation.
As discussed in the mail thread, we are focusing on the NPU side first.
``` | ||
On Switch:

root@sonic:/home/cisco# show platform voltage
@nissampa What is the expected output when this CLI runs on the DPU host?
It will display only the entries that belong to that DPU host.
@nissampa could you please update the PR description with the corresponding sonic-mgmt code PR?
Updated it.
Woot!
* Create DPU-test-plan.md * Rename DPU-test-plan.md to Smartswitch-test-plan.md * Update Smartswitch-test-plan.md
Description of PR
The SmartSwitch is a next-generation data center switch for T0/T1 roles that now subsumes the DPU. This PR describes test cases to validate the additional platform management functions, such as FPD, console, power management, health, software upgrade, and life-cycle scenarios, that are needed due to the presence of these DPUs in the system.
PR Link to Test Case Scripts: #14152
Back port request