Add PIT HLD document. #1014

Closed
wants to merge 3 commits into from
164 changes: 164 additions & 0 deletions doc/pit/Platform_Integration_Test_high_level_design.md
@@ -0,0 +1,164 @@
# Platform Integration Test High Level Design

### Document History

| Date | Version | Authors | Description |
|------|---------|---------|-------------|
| 2022/03/01 | V0.1 | Clark Lee (guizhao.lh@alibaba-inc.com) | Initial version |
| 2022/04/29 | V0.2 | Clark Lee (guizhao.lh@alibaba-inc.com) | Detail added for platform plugins, RESTful APIs |

# Scope

This document describes a subsystem for Platform Integration Test (PIT). PIT standardizes and automates the verification of white-box switch hardware functionality, together with the associated drivers and firmware. As a result, it becomes easier to port SONiC to a new white-box platform that is PIT-verified. The current scope covers all hardware components, such as CPU, memory, SSD, power system, fans, and system sensors of various kinds, as well as logical device firmware management, the BMC subsystem, etc.

Collaborator

How does PIT compare to PMON? What is the problem in PMON that we are trying to address with PIT?

Collaborator Author

@lihuay
That's a good question. I'll try to summarize it briefly below.

Q1: PIT and PMON differ in several aspects.

  1. PIT is a tool, while PMON is a monitoring daemon.
    PIT is a verification tool for the development, manufacturing, and delivery acceptance test stages. It is not a long-running process (daemon) like PMON, which is a regular container. PIT won't (and should not) be run in the production stage.
  2. PIT runs before production deployment, while PMON runs in production.
    PMON was created for, and works in, the running stage of switch systems without a BMC. It is used for platform monitoring, whereas PIT is not a monitoring tool.

Collaborator Author

> How does PIT compare to PMON? What is the problem in PMON that we are trying to address with PIT?

@lihuay
The second question: what issue is PIT addressing?

PIT tries to solve one core issue: letting users tell their ODM partners how the hardware, drivers, and firmware will be verified. This reduces the workload for both users and ODMs, because much of the duplicated requirement discussion and development detail (hardware and driver functionality, testing requirements) can be expressed as test logic (in code).


# 1. Platform Integration Test (PIT) background

In the growing SONiC community, many users build their software on open hardware, especially white-box switches. Because the device hardware is built by ODM vendors rather than by end users (the users' SONiC development teams), a systematic method is needed to verify whether the hardware, the firmware (such as CPLD, FPGA, BMC, and BIOS), and their drivers meet the users' requirements. The PIT (Platform Integration Test) system is introduced for this purpose.

PIT can reduce the workload of both ODMs and end users. It automates hardware, firmware, and driver verification, and it does so in a platform-independent way: it provides a set of platform APIs that unify platform functions, so that hardware from different ODMs presents a uniform view to users. PIT is the first standard in the SONiC community that aims to simplify porting SONiC to white-box switches. Simpler porting may attract more users and ODM partners to the SONiC community.

# 2. Platform Integration Test subsystem overview

## 2.1 PIT feature overview

PIT is a subsystem dedicated to the verification of white-box switch hardware, firmware, and drivers.
PIT provides:

1) An automated test framework for platform hardware, firmware, and driver tests;
2) A set of test cases written in a platform-independent (CPU, ASIC, product) manner;
3) A set of platform APIs that provide hardware function abstraction (complementary to the SONiC platform APIs), which helps keep test case logic platform-independent.

## 2.2 PIT deployment scenarios

Figure 2-1 shows where PIT is used.

![](images/PIT_subsystem_overview.png)
Figure 2-1

PIT runs in a SONiC variant called D4OS (Datacenter Development Diagnostic and Delivery OS). D4OS is based on SONiC and tailored to the development process of a switch product: it is co-developed by ODMs and users, and adds the users' delivery-process tools, the system default configuration, etc. PIT can be deployed in three stages of a switch device's development timeline: the development and manufacturing stages, which take place on the ODM side, and the delivery stage, which takes place on the IDC user side. The verified hardware drivers can then be added to and run in the user's SONiC system without changes.

Collaborator

@lihuay lihuay Nov 21, 2022

Please avoid creating new brand names. -- I'm referring to the D4OS name

Collaborator Author

Hi @lihuay, thanks for the advice. So, should I switch back to the master branch and redo this PR?

Collaborator

@lihuay lihuay Nov 22, 2022

Yes, you need to redo the PR against the master branch, since the current PR is not against any branch.

Collaborator Author

> Yes, you need to redo the PR against the master branch, since the current PR is not against any branch.

Please check this PR; I changed the branch to master.
#1127


### 2.2.1 Development test

This happens on the ODM vendor side, in the product development stage. PIT can be used to:

1) Run as a daily regression test to check whether a new feature breaks existing features.
2) Run as a stress test for any given function.
3) Run in environmental reliability tests, to check whether environmental changes (such as high/low temperature or humidity changes) affect device functions.

### 2.2.2 Manufacturing test

This happens on the ODM vendor side, in the product manufacturing stage. PIT can be used to:

1) Test devices automatically during manufacturing, using exactly the test logic from the users' requirements, which avoids misunderstandings between the users' requirements and the ODM implementation.
2) Provide the users' minimum test coverage.

### 2.2.3 Delivery test

This happens on the user side, in the product delivery stage. PIT can be used to:

1) Provide a fast, automated test of the product that checks the basic functionality of the device hardware, firmware, and drivers.
2) Provide structured data as the test result, which can be easily checked by the users' acceptance checking system.

# 3. PIT system design

## 3.1 System view

PIT is intended to run in a Docker container, as a component of SONiC. It can be enabled or disabled in the build process. Because it is a container, it can also be deployed and upgraded flexibly on demand.

## 3.2 Objective

* PIT should run in different scenarios with simple user input.
* PIT should use exactly the same test logic for different products.
* PIT should generate standard output for each specific test item.
* PIT should be extensible.

## 3.3 PIT software architecture

Figure 3-1 shows the high-level software architecture of PIT.
![](images/PIT_software_architecture.png)
Figure 3-1

The PIT main process parses and interprets the user's command and options, then invokes the test case list generator. The latter reads the platform configuration and the test case configurations, traverses the test case set, selects the cases with the desired tags, and generates the final list of test cases. The main process then iterates over the selected cases, runs each one, and invokes the test result generator to generate its result. After all test cases in the list have finished, the result generator gathers the per-case results into a collective final result. This result is passed to the users' automation system for acceptance testing.

Besides auto-generated test case lists, PIT also supports a single case or a user-specified case list as input.
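
The run loop described in this section can be sketched roughly as follows. This is only an illustration under stated assumptions, not the PIT implementation: `run_cases`, the `registry` mapping, and the idea that a case is a callable returning pass/fail are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical registry: case name -> callable returning True on pass.
CaseFn = Callable[[], bool]


def run_cases(registry: Dict[str, CaseFn],
              generated: List[str],
              user_cases: Optional[List[str]] = None) -> Dict[str, bool]:
    """Run the user-specified cases if given, otherwise the generated list."""
    selected = user_cases if user_cases else generated
    outcomes: Dict[str, bool] = {}
    for name in selected:
        try:
            outcomes[name] = bool(registry[name]())
        except Exception:
            # A crashing or unknown case is recorded as failed, not fatal.
            outcomes[name] = False
    return outcomes


registry = {"fan-test": lambda: True, "psu-test": lambda: False}
print(run_cases(registry, ["fan-test", "psu-test"]))
print(run_cases(registry, [], user_cases=["fan-test"]))
```

Recording a crash as a failure, rather than aborting the run, is a design choice of this sketch so that one broken case does not hide the results of the others.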

## 3.4 PIT components in detail
Figure 3-2 shows the software components of PIT.
![](images/PIT_conponents.png)

### 3.4.1 Test cases list generator

The test case list generator generates a list of test cases. It parses the users' input options and the product-specific platform configuration, and selects test cases from the full test case database. It is used in the daily regression test and delivery test scenarios.

Collaborator

Is it possible to reuse the current test framework in the sonic-mgmt repository? If the new cases can be added there, it will also benefit those who use the sonic-mgmt test cases for SONiC daily regression.

Collaborator Author

Yes, it's possible to add them to the sonic-mgmt repository with some extra effort. Anyone who wants to take on this work is more than welcome.
PIT is intended to be used by ODM vendors in the development and manufacturing stages, which are early in a device's lifecycle. It is therefore designed as a stand-alone component that does not need the sonic-mgmt test framework (it is much more difficult to get sonic-mgmt deployed in these scenarios, as we focus on a per-device basis).

Collaborator

I agree with Kebo. It's much easier to put the tests together.

I'm not following on why standalone test is difficult.

Collaborator Author

> I agree with Kebo. It's much easier to put the tests together.
>
> I'm not following on why standalone test is difficult.

The most important reason is that ODM vendors cannot get access to SAI unless they pay extra, whereas we (users) have it, especially for new ASICs. In this situation they cannot get the whole of SONiC running; this is the reality for most smaller ODM vendors.


Each test case has a list of tags denoting its type, stage, etc. The current stage tags are: production, development, delivery. The type tags are: auto, manual, utility. Other tags can be added to carry extra information, so that cases can be grouped for new scenarios.
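
As a rough illustration of tag-based selection, the sketch below filters a small case database by tags. The `CASES` table and `select_cases` are hypothetical, and the rule that a case must carry all requested tags is an assumption; the design only states that cases with the desired tags are selected.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical case database: name -> tags. The tag values follow the
# stage and type tags named in the text above.
CASES: Dict[str, Set[str]] = {
    "fan-test": {"auto", "production", "delivery"},
    "psu-test": {"auto", "development"},
    "led-test": {"manual", "production"},
}


def select_cases(cases: Dict[str, Set[str]], required: Iterable[str]) -> List[str]:
    """Return, in sorted order, the cases that carry all required tags."""
    need = set(required)
    return sorted(name for name, tags in cases.items() if need <= tags)


print(select_cases(CASES, ["auto", "production"]))  # ['fan-test']
print(select_cases(CASES, ["production"]))          # ['fan-test', 'led-test']
```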

### 3.4.2 Manual test options

Alternatively, users can specify a single case or a list of test cases to run. Optionally, users can specify how many times to run the cases. This is used in the development stage, for developers to debug or test a single feature, or in stress test scenarios.
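
A command-line front end for these manual options might look like the sketch below. The flag names `--case` and `--repeat`, the program name, and both helper functions are assumptions made for illustration; the design does not define the actual PIT options.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse hypothetical manual-run options; these are not PIT's real flags."""
    parser = argparse.ArgumentParser(prog="pit-sketch")
    parser.add_argument("--case", action="append", default=[],
                        help="test case to run; may be given several times")
    parser.add_argument("--repeat", type=int, default=1,
                        help="how many times to run the selected cases")
    return parser.parse_args(argv)


def expand_runs(cases: List[str], repeat: int) -> List[str]:
    """Repeat the whole case list `repeat` times, preserving order."""
    if repeat < 1:
        raise ValueError("repeat must be at least 1")
    return list(cases) * repeat


args = parse_args(["--case", "fan-test", "--case", "psu-test", "--repeat", "2"])
print(expand_runs(args.case, args.repeat))
```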

### 3.4.3 Test result generator

A test result is generated per case. There are two result formats:

1) Formatted strings, containing the test case name, ID, result, type, etc. This standardized output is used in the manufacturing process, so that output strings can be easily mapped to hardware, software, firmware, or configuration faults.
2) Structured results. The current design uses JSON, as it is commonly used and can interface with users' automated acceptance checking systems in the product delivery process.

Besides the per-case results, a collective result is also generated as a summary of the whole test. Like the per-case result, it consists of a standardized output string and structured data (JSON).

Figure 3-3 shows an example of final test result JSON.
![](images/PIT_final_result.png)
Figure 3-3
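
The two result formats and the collective summary can be sketched as below. The field names, the string layout, and the summary keys are all hypothetical; they are not taken from Figure 3-3 and only illustrate the idea of emitting one fixed-layout line plus one JSON record per case.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class CaseResult:
    """Hypothetical per-case record; field names are illustrative only."""
    case_id: int
    name: str
    type: str
    result: str  # "pass" or "fail"


def format_line(r: CaseResult) -> str:
    """Fixed-layout string form, meant for line-oriented factory tooling."""
    return f"[{r.case_id:03d}] {r.name} type={r.type} result={r.result.upper()}"


def summarize(results: List[CaseResult]) -> Dict[str, object]:
    """Collective result: counts plus the per-case records."""
    failed = [r.name for r in results if r.result != "pass"]
    return {
        "total": len(results),
        "passed": len(results) - len(failed),
        "failed": failed,
        "cases": [asdict(r) for r in results],
    }


results = [CaseResult(1, "fan-test", "auto", "pass"),
           CaseResult(2, "psu-test", "auto", "fail")]
for r in results:
    print(format_line(r))
print(json.dumps(summarize(results), indent=2))
```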

### 3.4.4 Test case logic

Figure 3-4 shows the test case logic and interaction.
![](images/PIT_test_case_logic.png)
Figure 3-4

Each test case has a configuration file, containing information such as test case name, description, type, and tag list.
Each platform has a platform configuration file, containing platform specific (product specific) information such as CPU architecture, boot loader, CPU error monitor capability, peripheral related information (e.g. FAN number, PSU control options, etc.).
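
A loader for the per-case configuration could be sketched as follows. It assumes the file is JSON with the four fields named above, matching the `fan-test` example given in the review thread; `CaseConfig` and `load_case_config` are hypothetical names, and the platform configuration format is not covered because the document does not define it.

```python
import json
from dataclasses import dataclass
from typing import List

REQUIRED_FIELDS = ("name", "description", "type", "tags")


@dataclass
class CaseConfig:
    name: str
    description: str
    type: str
    tags: List[str]


def load_case_config(text: str) -> CaseConfig:
    """Parse a JSON case configuration and check the expected fields."""
    data = json.loads(text)
    missing = [f for f in REQUIRED_FIELDS if f not in data]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    if not isinstance(data["tags"], list):
        raise ValueError("tags must be a list")
    return CaseConfig(data["name"], data["description"], data["type"], list(data["tags"]))


cfg = load_case_config(
    '{"name": "fan-test", "description": "Check fan status",'
    ' "type": "auto", "tags": ["manufacture", "delivery"]}'
)
print(cfg.name, cfg.tags)
```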

Is there an example of the platform configuration file?

I am concerned that this will be yet another configuration file that platform vendors will have to generate and maintain when bringing platforms in to SONiC. Can the platform configuration be derived from already-existing platform configs?
https://github.com/Azure/sonic-buildimage/blob/master/device/dell/x86_64-dell_s6100_c2538-r0/platform.json is an example of an existing platform config.

There is also the PDDF feature that was contributed by Broadcom that provides structured platform config data as well. Example:
https://github.com/Azure/sonic-buildimage/blob/master/device/accton/x86_64-accton_as7712_32x-r0/pddf/pddf-device.json

Collaborator Author

Hi Jeff,

The case configuration is for the test case; it does not imply any implementation of the switch. An example for a fan test case:

    {
        "name": "fan-test",
        "description": "Check fan status",
        "type": "auto",
        "tags": ["manufacture", "delivery", "pa", "emc"]
    }

As shown above, the fan test case is tagged with a list of tags, which are used to generate the test case list for a given scenario. For example, "manufacture" means this case should be included in the manufacturing test during the manufacturing process.

PIT is totally different from PDDF/PDK; PIT is a verification system.

The logic of each test case should be platform-independent. As shown in Figure 3-4, test logic configures the device or fetches information through platform APIs. The implementation of these platform APIs can be standardized if the drivers are standardized (optional). On systems with an integrated BMC, the platform APIs access this information through a set of RESTful APIs to the BMC. A test case loads its per-case configuration and gets platform-specific information during the test. Together, these keep the test logic itself platform-independent.
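
To make the separation concrete, here is a minimal sketch in which the test logic depends only on an abstract API and on a value from the platform configuration. `FanApi`, `FakeLocalFans`, `fan_test`, and the `fan_number` key are hypothetical; a BMC-backed implementation would be another subclass of the same interface.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class FanApi(ABC):
    """Hypothetical platform API surface used by the test logic."""

    @abstractmethod
    def get_num_fans(self) -> int: ...

    @abstractmethod
    def get_presence(self, index: int) -> bool: ...


class FakeLocalFans(FanApi):
    """Stand-in for a driver-backed implementation."""

    def __init__(self, present: List[bool]) -> None:
        self._present = present

    def get_num_fans(self) -> int:
        return len(self._present)

    def get_presence(self, index: int) -> bool:
        return self._present[index]


def fan_test(api: FanApi, platform_cfg: Dict[str, int]) -> bool:
    """Platform-independent logic: the expected count comes from platform config."""
    if api.get_num_fans() != platform_cfg["fan_number"]:
        return False
    return all(api.get_presence(i) for i in range(api.get_num_fans()))


print(fan_test(FakeLocalFans([True, True]), {"fan_number": 2}))   # True
print(fan_test(FakeLocalFans([True, False]), {"fan_number": 2}))  # False
```

Because `fan_test` never touches drivers or the BMC directly, the same function can run against any product that supplies a `FanApi` implementation and a platform configuration.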

# 4. Platform plugin APIs

## 4.1 Extended platform plugins

Platform plugins are intended to provide an abstraction of device functions. To test full functionality, we extend some of their APIs for testing purposes. Most of these extensions are not intended for use in a running deployment; they are defined for fine-grained testing only. The extended/added plugins are as follows:

* device/$platform_name/plugins/sfputil.py

Collaborator

Is it possible to extend the platform APIs instead of extending the plugins? Platform plugins are being phased out, and many platforms don't support them anymore.

Collaborator Author

Yes, in phase 2 we may support platform APIs. Platform plugins are supported because they back our massive deployment and many users still use them; that's why our first step extends plugins rather than platform APIs.

Collaborator

would be good to clarify the plan in the document itself. Incremental deliverable is OK, but let's share the plan.

Collaborator Author

> would be good to clarify the plan in the document itself. Incremental deliverable is OK, but let's share the plan.

Thanks, I'll add it to HLD.

* device/$platform_name/plugins/fanutil.py
* device/$platform_name/plugins/psuutil.py
* device/$platform_name/plugins/sensorutil.py
* device/$platform_name/plugins/fwmgrutil.py
* device/$platform_name/plugins/led_control.py
* device/$platform_name/plugins/bmcutil.py
* device/$platform_name/plugins/mgmt.port.py
* device/$platform_name/plugins/fruidutil.py
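
One way a test could pick up a plugin from the `device/$platform_name/plugins/` layout listed above is to import it by file path. The sketch below uses the standard `importlib` machinery; `load_plugin` and its parameters are hypothetical, and how PIT actually locates and loads plugins is not specified in this document.

```python
import importlib.util
from pathlib import Path
from types import ModuleType


def load_plugin(device_root: str, platform_name: str, plugin: str) -> ModuleType:
    """Import <device_root>/<platform_name>/plugins/<plugin>.py as a module."""
    path = Path(device_root) / platform_name / "plugins" / f"{plugin}.py"
    if not path.is_file():
        raise FileNotFoundError(f"no plugin at {path}")
    spec = importlib.util.spec_from_file_location(plugin, str(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

A caller would pass the directory that holds the `device/` tree on the target, a platform name, and a plugin name such as `fanutil`.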

# 5. BMC component

## 5.1 System with BMC support

Some systems are built with a BMC subsystem, which provides extra functionality for management and maintenance. Systems with a BMC usually offload some hardware monitoring, power management, and firmware management features to it. A standardized way for SONiC and the BMC subsystem to communicate is therefore necessary. Hence, we define a set of RESTful APIs so that the plugin implementations can be unified.

![](images/PIT_BMC_component.png)
Figure 5-1

## 5.2 RESTful APIs

These RESTful APIs are defined to control various hardware components through the BMC. Because they are standardized, the platform plugins can share the same implementation for these functions.

* api/fan/info
* api/fan/number
* api/psu/info
* api/psu/number
* api/sensor/info
* api/bmc/info
* api/bmc/nextboot
* api/bmc/reboot
* api/bmc/status
* api/firmware/biosnextboot
* api/firmware/upgrade
* api/firmware/refresh
* api/firmware/cpldversion
* api/misc/biosbootstatus
* api/bmc/raw
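
A plugin-side client for these endpoints might be sketched as below. Only the endpoint paths come from the list above. The BMC address, the use of plain HTTP, the GET method, and a JSON response body are all assumptions, since the document does not specify them; `build_url` and `bmc_get` are hypothetical helpers.

```python
import json
import urllib.request
from typing import Any

# Endpoint paths copied from the list above; everything else is assumed.
ENDPOINTS = {
    "fan_info": "api/fan/info",
    "psu_info": "api/psu/info",
    "bmc_status": "api/bmc/status",
}


def build_url(bmc_host: str, endpoint: str) -> str:
    """Join a (hypothetical) BMC address with one of the listed paths."""
    return f"http://{bmc_host}/{ENDPOINTS[endpoint]}"


def bmc_get(bmc_host: str, endpoint: str, timeout: float = 5.0) -> Any:
    """GET an endpoint and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_url(bmc_host, endpoint), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# 192.0.2.10 is a documentation-only address; no request is sent here.
print(build_url("192.0.2.10:8080", "fan_info"))
```

Keeping URL construction separate from the network call lets the plugin logic be unit-tested without a BMC present.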
Binary file added doc/pit/images/PIT_BMC_component.png
Binary file added doc/pit/images/PIT_conponents.png
Binary file added doc/pit/images/PIT_final_result.png
Binary file added doc/pit/images/PIT_software_architecture.png
Binary file added doc/pit/images/PIT_subsystem_overview.png
Binary file added doc/pit/images/PIT_test_case_logic.png