DRAFT: Independent DPU Upgrade HLD #1906
base: master
/azp run
No pipelines are associated with this pull request.
Hi @hdwhdw - are you looking for a Reviewer for this PR?
@KrisNey-MSFT yes please. Thank you :)
### 2. Scope

This document describes the high-level design of the sequence to independently upgrade a SmartSwitch DPU, with minimal impact on other DPUs and the NPU, through the gNOI API.
Explicitly define dependencies and any ordering constraints to prevent unexpected failures.
Added. Let me know if there are any other dependencies.
* 'System.SetPackage'
* 'OS.Activate'
* 'Containerz.Deploy'
* Rollback:
Added a sentence for that. If Activate fails, we should SetPackage (install) the previous image and Activate again.
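For illustration, the fallback described in this reply could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the HLD's implementation: `FakeDpuClient`, `ActivateError`, `upgrade_dpu`, and the image names are hypothetical stand-ins, and the two methods only record which gNOI RPC ('System.SetPackage', 'OS.Activate') a real client would issue.

```python
from dataclasses import dataclass, field


class ActivateError(Exception):
    """Hypothetical error raised when activating an image fails."""


@dataclass
class FakeDpuClient:
    """Hypothetical stand-in for a gNOI client talking to one DPU (not a real API)."""
    bad_images: set = field(default_factory=set)  # images whose activation fails
    active: str = "old.bin"                       # image currently running
    calls: list = field(default_factory=list)     # record of RPCs issued

    def set_package(self, image):
        # Stands in for the 'System.SetPackage' RPC (install the image).
        self.calls.append(("System.SetPackage", image))

    def activate(self, image):
        # Stands in for the 'OS.Activate' RPC.
        self.calls.append(("OS.Activate", image))
        if image in self.bad_images:
            raise ActivateError(image)
        self.active = image


def upgrade_dpu(client, new_image, previous_image):
    """Install and activate new_image; on Activate failure, reinstall and
    reactivate previous_image, as described in the reply above."""
    client.set_package(new_image)
    try:
        client.activate(new_image)
        return "upgraded"
    except ActivateError:
        # Rollback: SetPackage the previous image, then Activate again.
        client.set_package(previous_image)
        # If this also fails, the exception propagates to the caller,
        # which corresponds to the case needing manual intervention.
        client.activate(previous_image)
        return "rolled_back"
```

With `FakeDpuClient(bad_images={"new.bin"})`, calling `upgrade_dpu(client, "new.bin", "old.bin")` issues SetPackage and Activate for the new image, then SetPackage and Activate for the old one, and leaves `client.active` at `"old.bin"`.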
3. DPU and NPU image compatibility: The upgrade process assumes that the DPU and NPU images are compatible with each other. It is up to the client to ensure the compatibility of the images.
4. Eliminating human intervention: The upgrade process may require human intervention to resolve issues that cannot be handled automatically. In particular, when both the upgrade and the rollback fail, the system may be left in an inconsistent state that requires manual intervention.

### 6. Architecture Degn
Architecture Design
* 'Containerz.Deploy'
* Rollback:
  * Rollback the new SONiC image on the DPU. Client issues 'OS.Activate' with the old SONiC image.
  * Rollback the new offloaded container images on the NPU. Client issues 'Containerz.RemoveImage' with the old container images.
Why does rolling back the new offloaded image issue RemoveImage with the old image?
This PR contains a draft HLD for DPU Independent Upgrade.
What we did:
Supports independent SmartSwitch DPU upgrade.
Why we did it:
Supports managing the DPU SONiC version independently in a SmartSwitch.
PRs and States: