From 6165d57b97d98ff108707ae6d2cec5417027c6ff Mon Sep 17 00:00:00 2001 From: Longyin Huang Date: Wed, 21 Jun 2023 19:49:42 -0700 Subject: [PATCH 01/13] Add sff_mgr doc --- .../Interface-Link-bring-up-sequence.md | 39 ++++++++++++++++++- 1 file changed, 37 insertions(+), 2 deletions(-) diff --git a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md index 9a993ad6cca..ab372de86bc 100644 --- a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md +++ b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md @@ -149,11 +149,13 @@ Please refer to the flow/sequence diagrams which covers the following required - No transceiver present # Feature enablement + This feature (optics Interface Link bring-up sequence) would be enabled on per platform basis. There could be cases where vendor(s)/platform(s) may take time to shift from existing codebase to the model (work-flows) described in this document. +## For CMIS/C-CMIS modules: In order to avoid any breakage and ensure gradual migration of different platforms/vendors to this model, will add this new workflow to enable/disable this feature: - In order to enable this feature, the platform would set ‘skip_xcvrd_cmis_mgr’ to ‘false’ in their respective pmon_daemon_control.json as part of platform bootstrap. When xcvrd would spawn on that hwsku (LC/board), it would parse ‘skip_xcvrd_cmis_mgr’ and if found 'false', it would launch CMIS task manager. This implies enabling this feature. + In order to enable this feature, the platform would set ‘skip_xcvrd_cmis_mgr’ to ‘false’ in their respective pmon_daemon_control.json as part of platform bootstrap. When xcvrd would spawn on that hwsku (LC/board), it would parse ‘skip_xcvrd_cmis_mgr’ and if found 'false', it would launch CMIS task manager. This implies enabling this feature. Else, if ‘skip_xcvrd_cmis_mgr’ is set/found 'true' by xcvrd, it would skip launching CMIS task manager and this feature would remain disabled. 
If a platform/vendor does not specify/set ‘skip_xcvrd_cmis_mgr’, xcvrd would exercise the default workflow (i.e. when xcvrd detects QSFP-DD, it would luanch CMIS task manager and initialize the module per CMIS specification). @@ -163,7 +165,11 @@ Note: This feature flag (skip_xcvrd_cmis_mgr) was added as a flexibility in case Workflow : ![Enabling 'Interface link bring-up sequence' feature(2)](https://user-images.githubusercontent.com/69485234/154403945-654b49d7-e85f-4a7a-bb4d-e60a16b826a7.png) - +## For SFF compliant modules: +Similarly, in order to enable this feature for SFF compliant modules, the platform would set ‘enable_xcvrd_sff_mgr’ to ‘true’ in their respective pmon_daemon_control.json. Xcvrd would parse ‘skip_xcvrd_cmis_mgr’ and if found 'true', it would launch SFF task manager. This implies enabling this feature. +If a platform/vendor does not specify/set ‘enable_xcvrd_sff_mgr’, xcvrd would not enable this feature, no deterministic bring-up flow. +> **_NOTE:_** +There is a behavior change (and requirement) for the platforms that enable this sff_mgr feature: platform needs to keep TX in disabled state after module coming out-of-reset, in either module insertion or bootup cases. This is to make sure the module is not transmitting with TX enabled before host_tx_ready is True. Before enabling this feature, platform needs to follow this behavior and verify it fully. There's no impact for the platforms that don't enable it, and no impact for the systems in current deployment, since they didn't set the enable_flag explictly. 
# Transceiver Initialization (at platform bootstrap layer) @@ -184,6 +190,35 @@ if transceiver is not present: - All the workflows mentioned above will reamin same ( or get exercised) till host_tx_ready field update - xcvrd will not perform any action on receiving host_tx_ready field update +# Flow chart +## SFF task manager (sff_mgr) +```mermaid +graph TD; +A[check if sff_mgr is enabled] +B[spawn sff_mgr] +C[subscribe to events] +D[while task_stopping_event is not set] +E[check for insertion event and host_tx_ready change event] +F[double check if module is present] +G[calculate the target tx_disable value based on host_tx_ready] +H[check if tx_disable status on module is already the target value] +I[go ahead to enable/disable TX based on the target tx_disable value] + +Start --> A +A -- true --> B +B --> C +C --> D +D -- true --> E +E -- if either happened --> F +E -- if neither happened --> D +F --> G +G --> H +H -- true --> D +H -- false --> I +I --> D +D -- false --> End +A -- false --> End +``` # Out of Scope Following items are not in the scope of this document. 
They would be taken up separately From 76caa4d5712e8f80f871d04f0739b93b0af52033 Mon Sep 17 00:00:00 2001 From: Longyin Huang Date: Thu, 22 Jun 2023 11:11:50 -0700 Subject: [PATCH 02/13] Update chart --- doc/sfp-cmis/Interface-Link-bring-up-sequence.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md index ab372de86bc..9d561e1e260 100644 --- a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md +++ b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md @@ -211,7 +211,8 @@ C --> D D -- true --> E E -- if either happened --> F E -- if neither happened --> D -F --> G +F -- true --> G +F -- false --> D G --> H H -- true --> D H -- false --> I From f502f6c0083cf64f5e5ca426efdfc886fa015e5e Mon Sep 17 00:00:00 2001 From: Longyin Huang Date: Thu, 22 Jun 2023 13:38:21 -0700 Subject: [PATCH 03/13] Update chart --- .../Interface-Link-bring-up-sequence.md | 58 +++++++++++-------- 1 file changed, 35 insertions(+), 23 deletions(-) diff --git a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md index 9d561e1e260..57897f600f6 100644 --- a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md +++ b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md @@ -191,34 +191,46 @@ if transceiver is not present: - xcvrd will not perform any action on receiving host_tx_ready field update # Flow chart -## SFF task manager (sff_mgr) +## How Xcvrd main thread spawns SFF task manager thread: ```mermaid graph TD; -A[check if sff_mgr is enabled] -B[spawn sff_mgr] -C[subscribe to events] -D[while task_stopping_event is not set] -E[check for insertion event and host_tx_ready change event] -F[double check if module is present] -G[calculate the target tx_disable value based on host_tx_ready] -H[check if tx_disable status on module is already the target value] -I[go ahead to enable/disable TX based on the target tx_disable value] +A[wait for 
PortConfigDone] +B[check if enable_sff_mgr flag exists and is set to true] +C[spawns sff_mgr] +D[proceed to other thread spawning] Start --> A -A -- true --> B -B --> C +A --> B +B -- true --> C C --> D -D -- true --> E -E -- if either happened --> F -E -- if neither happened --> D -F -- true --> G -F -- false --> D -G --> H -H -- true --> D -H -- false --> I -I --> D -D -- false --> End -A -- false --> End +B -- false --> D +D --> End +``` +## SFF task manager main flow: +```mermaid +graph TD; +A[subscribe to events] +B[while task_stopping_event is not set] +C[check for insertion event and host_tx_ready change event] +D[double check if module is present] +E[fetch DB and update host_tx_ready value in local cahce, if not available locally] +F[calculate the target tx_disable value based on host_tx_ready] +G[check if tx_disable status on module is already the target value] +H[go ahead to enable/disable TX based on the target tx_disable value] + +Start --> A +A --> B +B -- true --> C +C -- if either happened --> E +C -- if neither happened --> B +E --> D +D -- true --> F +D -- false --> B +F --> G +G -- true --> B +G -- false --> H +H --> B +B -- false --> End ``` # Out of Scope From 016ea40bfa296049bcd952ebaa9292cf57996894 Mon Sep 17 00:00:00 2001 From: Longyin Huang Date: Thu, 22 Jun 2023 13:40:02 -0700 Subject: [PATCH 04/13] Update chart --- doc/sfp-cmis/Interface-Link-bring-up-sequence.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md index 57897f600f6..294a1444693 100644 --- a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md +++ b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md @@ -196,8 +196,8 @@ if transceiver is not present: graph TD; A[wait for PortConfigDone] B[check if enable_sff_mgr flag exists and is set to true] -C[spawns sff_mgr] -D[proceed to other thread spawning] +C[spawn sff_mgr] +D[proceed to other thread spawning and tasks] 
Start --> A A --> B From 1097c5848480d5ce38212f5a52d6886563bf57d9 Mon Sep 17 00:00:00 2001 From: Longyin Huang Date: Fri, 23 Jun 2023 16:09:38 -0700 Subject: [PATCH 05/13] Addresse comments --- .../Interface-Link-bring-up-sequence.md | 60 +++++++++---------- 1 file changed, 30 insertions(+), 30 deletions(-) diff --git a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md index 294a1444693..fabed531dec 100644 --- a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md +++ b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md @@ -166,32 +166,12 @@ Note: This feature flag (skip_xcvrd_cmis_mgr) was added as a flexibility in case ![Enabling 'Interface link bring-up sequence' feature(2)](https://user-images.githubusercontent.com/69485234/154403945-654b49d7-e85f-4a7a-bb4d-e60a16b826a7.png) ## For SFF compliant modules: -Similarly, in order to enable this feature for SFF compliant modules, the platform would set ‘enable_xcvrd_sff_mgr’ to ‘true’ in their respective pmon_daemon_control.json. Xcvrd would parse ‘skip_xcvrd_cmis_mgr’ and if found 'true', it would launch SFF task manager. This implies enabling this feature. -If a platform/vendor does not specify/set ‘enable_xcvrd_sff_mgr’, xcvrd would not enable this feature, no deterministic bring-up flow. -> **_NOTE:_** -There is a behavior change (and requirement) for the platforms that enable this sff_mgr feature: platform needs to keep TX in disabled state after module coming out-of-reset, in either module insertion or bootup cases. This is to make sure the module is not transmitting with TX enabled before host_tx_ready is True. Before enabling this feature, platform needs to follow this behavior and verify it fully. There's no impact for the platforms that don't enable it, and no impact for the systems in current deployment, since they didn't set the enable_flag explictly. 
- -# Transceiver Initialization - (at platform bootstrap layer) - -![LC boot-up sequence - optics INIT (platform bootstrap)](https://user-images.githubusercontent.com/69485234/152261613-e20dcda9-2adc-42aa-a1f1-4b8a47dd32af.png) - -# Applying 'interface admin startup' configuration - -![LC boot-up sequence - 'admin enable' Config gets applied](https://user-images.githubusercontent.com/69485234/147166867-56f3e82d-1b1c-4b7a-a867-5470ee6050e7.png) - - -# Applying 'interface admin shutdown' configuration - -![LC boot-up sequence - 'admin disable' Config gets applied](https://user-images.githubusercontent.com/69485234/147166884-92c9af48-2d64-4e67-8933-f80531d821b4.png) - -# No transceiver present -if transceiver is not present: - - All the workflows mentioned above will reamin same ( or get exercised) till host_tx_ready field update - - xcvrd will not perform any action on receiving host_tx_ready field update - -# Flow chart -## How Xcvrd main thread spawns SFF task manager thread: +- SFF task manager (sff_mgr) feature brings the deterministic approach for interface link bring-up to SFF compliant module. +- By default, sff_mgr feature is disabled. +- In order to enable sff_mgr feature, the platform would set ‘enable_xcvrd_sff_mgr’ to ‘true’ in their respective pmon_daemon_control.json. Xcvrd would parse ‘enable_xcvrd_sff_mgr’ and if found 'true', it would launch SFF task manager (sff_mgr). +> **_Pre-requisite for enabling sff_mgr:_** +Platform needs to leave the transceiver (if capable of disabling TX) in TX disabled state when an module inserted or during boot-up. This is to make sure the transceiver is not transmitting with TX enabled before host_tx_ready is True. 
+### Flow of xcvrd main thread spawning sff_mgr thread: ```mermaid graph TD; A[wait for PortConfigDone] @@ -206,12 +186,13 @@ C --> D B -- false --> D D --> End ``` -## SFF task manager main flow: +### Flow of sff_mgr: +(```tx_disable value/status``` is: ```True``` if TX is disabled; ```False``` if TX is enabled) ```mermaid graph TD; A[subscribe to events] B[while task_stopping_event is not set] -C[check for insertion event and host_tx_ready change event] +C[check for insertion event and host_tx_ready change event for the intended ports] D[double check if module is present] E[fetch DB and update host_tx_ready value in local cahce, if not available locally] F[calculate the target tx_disable value based on host_tx_ready] @@ -221,8 +202,8 @@ H[go ahead to enable/disable TX based on the target tx_disable value] Start --> A A --> B B -- true --> C -C -- if either happened --> E -C -- if neither happened --> B +C -- if either event happened --> E +C -- if neither event happened --> B E --> D D -- true --> F D -- false --> B @@ -233,6 +214,25 @@ H --> B B -- false --> End ``` +# Transceiver Initialization + (at platform bootstrap layer) + +![LC boot-up sequence - optics INIT (platform bootstrap)](https://user-images.githubusercontent.com/69485234/152261613-e20dcda9-2adc-42aa-a1f1-4b8a47dd32af.png) + +# Applying 'interface admin startup' configuration + +![LC boot-up sequence - 'admin enable' Config gets applied](https://user-images.githubusercontent.com/69485234/147166867-56f3e82d-1b1c-4b7a-a867-5470ee6050e7.png) + + +# Applying 'interface admin shutdown' configuration + +![LC boot-up sequence - 'admin disable' Config gets applied](https://user-images.githubusercontent.com/69485234/147166884-92c9af48-2d64-4e67-8933-f80531d821b4.png) + +# No transceiver present +if transceiver is not present: + - All the workflows mentioned above will reamin same ( or get exercised) till host_tx_ready field update + - xcvrd will not perform any action on receiving host_tx_ready field 
update + # Out of Scope Following items are not in the scope of this document. They would be taken up separately 1. xcvrd restart From a0f03b2fdf37d3e5c40c4eca16ccb2e3111fb8bf Mon Sep 17 00:00:00 2001 From: Longyin Huang Date: Fri, 23 Jun 2023 16:48:17 -0700 Subject: [PATCH 06/13] Add link to reasons --- doc/sfp-cmis/Interface-Link-bring-up-sequence.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md index fabed531dec..a817ff10f09 100644 --- a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md +++ b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md @@ -166,7 +166,7 @@ Note: This feature flag (skip_xcvrd_cmis_mgr) was added as a flexibility in case ![Enabling 'Interface link bring-up sequence' feature(2)](https://user-images.githubusercontent.com/69485234/154403945-654b49d7-e85f-4a7a-bb4d-e60a16b826a7.png) ## For SFF compliant modules: -- SFF task manager (sff_mgr) feature brings the deterministic approach for interface link bring-up to SFF compliant module. +- SFF task manager (sff_mgr) feature brings the deterministic approach for interface link bring-up to SFF compliant modules. (Refer to [here](#plan) for the reasons of sff_mgr) - By default, sff_mgr feature is disabled. - In order to enable sff_mgr feature, the platform would set ‘enable_xcvrd_sff_mgr’ to ‘true’ in their respective pmon_daemon_control.json. Xcvrd would parse ‘enable_xcvrd_sff_mgr’ and if found 'true', it would launch SFF task manager (sff_mgr). 
> **_Pre-requisite for enabling sff_mgr:_** From adad6a75a6126488d253f499615fc3c265ca577d Mon Sep 17 00:00:00 2001 From: Longyin Huang Date: Fri, 23 Jun 2023 16:50:57 -0700 Subject: [PATCH 07/13] Minor change --- doc/sfp-cmis/Interface-Link-bring-up-sequence.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md index a817ff10f09..d2352a00a52 100644 --- a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md +++ b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md @@ -192,7 +192,7 @@ D --> End graph TD; A[subscribe to events] B[while task_stopping_event is not set] -C[check for insertion event and host_tx_ready change event for the intended ports] +C[check for insertion event and host_tx_ready change event for each intended port] D[double check if module is present] E[fetch DB and update host_tx_ready value in local cahce, if not available locally] F[calculate the target tx_disable value based on host_tx_ready] From 30a659ab05ed140272a69e4f22b82c8edd8935ac Mon Sep 17 00:00:00 2001 From: Longyin Huang Date: Mon, 26 Jun 2023 13:27:09 -0700 Subject: [PATCH 08/13] Remove empty space change --- doc/sfp-cmis/Interface-Link-bring-up-sequence.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md index d2352a00a52..2211bc84886 100644 --- a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md +++ b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md @@ -149,13 +149,12 @@ Please refer to the flow/sequence diagrams which covers the following required - No transceiver present # Feature enablement - This feature (optics Interface Link bring-up sequence) would be enabled on per platform basis. There could be cases where vendor(s)/platform(s) may take time to shift from existing codebase to the model (work-flows) described in this document. 
## For CMIS/C-CMIS modules: In order to avoid any breakage and ensure gradual migration of different platforms/vendors to this model, will add this new workflow to enable/disable this feature: - In order to enable this feature, the platform would set ‘skip_xcvrd_cmis_mgr’ to ‘false’ in their respective pmon_daemon_control.json as part of platform bootstrap. When xcvrd would spawn on that hwsku (LC/board), it would parse ‘skip_xcvrd_cmis_mgr’ and if found 'false', it would launch CMIS task manager. This implies enabling this feature. + In order to enable this feature, the platform would set ‘skip_xcvrd_cmis_mgr’ to ‘false’ in their respective pmon_daemon_control.json as part of platform bootstrap. When xcvrd would spawn on that hwsku (LC/board), it would parse ‘skip_xcvrd_cmis_mgr’ and if found 'false', it would launch CMIS task manager. This implies enabling this feature. Else, if ‘skip_xcvrd_cmis_mgr’ is set/found 'true' by xcvrd, it would skip launching CMIS task manager and this feature would remain disabled. If a platform/vendor does not specify/set ‘skip_xcvrd_cmis_mgr’, xcvrd would exercise the default workflow (i.e. when xcvrd detects QSFP-DD, it would luanch CMIS task manager and initialize the module per CMIS specification). @@ -233,6 +232,7 @@ if transceiver is not present: - All the workflows mentioned above will reamin same ( or get exercised) till host_tx_ready field update - xcvrd will not perform any action on receiving host_tx_ready field update + # Out of Scope Following items are not in the scope of this document. They would be taken up separately 1. 
xcvrd restart From aab423bd5662e2ffb546e55fae5345d5e41f45ca Mon Sep 17 00:00:00 2001 From: Longyin Huang Date: Mon, 26 Jun 2023 13:31:34 -0700 Subject: [PATCH 09/13] Add version --- doc/sfp-cmis/Interface-Link-bring-up-sequence.md | 1 + 1 file changed, 1 insertion(+) diff --git a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md index 2211bc84886..8512e56a540 100644 --- a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md +++ b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md @@ -34,6 +34,7 @@ Deterministic Approach for Interface Link bring-up sequence | 0.7 | 02/02/2022 | Jaganathan Anbalagan | Added Breakout Handling | 0.8 | 02/16/2022 | Shyam Kumar | Updated feature-enablement workflow | 0.9 | 04/05/2022 | Shyam Kumar | Addressed review comments | +| 0.10| 06/26/2023 | Longyin Huang | Added details for sff_mgr | # About this Manual From eaa653d18b54742b277db653c80dca0e819a87df Mon Sep 17 00:00:00 2001 From: Longyin Huang Date: Wed, 12 Jul 2023 15:30:43 -0700 Subject: [PATCH 10/13] Move sff_mgr contents to separate file under doc/xcvrd --- .../Interface-Link-bring-up-sequence.md | 51 +---- ...e-Link-bring-up-sequence-on-sff-modules.md | 180 ++++++++++++++++++ 2 files changed, 181 insertions(+), 50 deletions(-) create mode 100644 doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md diff --git a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md index 8512e56a540..9a993ad6cca 100644 --- a/doc/sfp-cmis/Interface-Link-bring-up-sequence.md +++ b/doc/sfp-cmis/Interface-Link-bring-up-sequence.md @@ -34,7 +34,6 @@ Deterministic Approach for Interface Link bring-up sequence | 0.7 | 02/02/2022 | Jaganathan Anbalagan | Added Breakout Handling | 0.8 | 02/16/2022 | Shyam Kumar | Updated feature-enablement workflow | 0.9 | 04/05/2022 | Shyam Kumar | Addressed review comments | -| 0.10| 06/26/2023 | Longyin Huang | Added details for sff_mgr | # About this Manual @@ -152,7 
+151,6 @@ Please refer to the flow/sequence diagrams which covers the following required # Feature enablement This feature (optics Interface Link bring-up sequence) would be enabled on per platform basis. There could be cases where vendor(s)/platform(s) may take time to shift from existing codebase to the model (work-flows) described in this document. -## For CMIS/C-CMIS modules: In order to avoid any breakage and ensure gradual migration of different platforms/vendors to this model, will add this new workflow to enable/disable this feature: In order to enable this feature, the platform would set ‘skip_xcvrd_cmis_mgr’ to ‘false’ in their respective pmon_daemon_control.json as part of platform bootstrap. When xcvrd would spawn on that hwsku (LC/board), it would parse ‘skip_xcvrd_cmis_mgr’ and if found 'false', it would launch CMIS task manager. This implies enabling this feature. @@ -165,54 +163,7 @@ Note: This feature flag (skip_xcvrd_cmis_mgr) was added as a flexibility in case Workflow : ![Enabling 'Interface link bring-up sequence' feature(2)](https://user-images.githubusercontent.com/69485234/154403945-654b49d7-e85f-4a7a-bb4d-e60a16b826a7.png) -## For SFF compliant modules: -- SFF task manager (sff_mgr) feature brings the deterministic approach for interface link bring-up to SFF compliant modules. (Refer to [here](#plan) for the reasons of sff_mgr) -- By default, sff_mgr feature is disabled. -- In order to enable sff_mgr feature, the platform would set ‘enable_xcvrd_sff_mgr’ to ‘true’ in their respective pmon_daemon_control.json. Xcvrd would parse ‘enable_xcvrd_sff_mgr’ and if found 'true', it would launch SFF task manager (sff_mgr). -> **_Pre-requisite for enabling sff_mgr:_** -Platform needs to leave the transceiver (if capable of disabling TX) in TX disabled state when an module inserted or during boot-up. This is to make sure the transceiver is not transmitting with TX enabled before host_tx_ready is True. 
-### Flow of xcvrd main thread spawning sff_mgr thread: -```mermaid -graph TD; -A[wait for PortConfigDone] -B[check if enable_sff_mgr flag exists and is set to true] -C[spawn sff_mgr] -D[proceed to other thread spawning and tasks] - -Start --> A -A --> B -B -- true --> C -C --> D -B -- false --> D -D --> End -``` -### Flow of sff_mgr: -(```tx_disable value/status``` is: ```True``` if TX is disabled; ```False``` if TX is enabled) -```mermaid -graph TD; -A[subscribe to events] -B[while task_stopping_event is not set] -C[check for insertion event and host_tx_ready change event for each intended port] -D[double check if module is present] -E[fetch DB and update host_tx_ready value in local cahce, if not available locally] -F[calculate the target tx_disable value based on host_tx_ready] -G[check if tx_disable status on module is already the target value] -H[go ahead to enable/disable TX based on the target tx_disable value] - -Start --> A -A --> B -B -- true --> C -C -- if either event happened --> E -C -- if neither event happened --> B -E --> D -D -- true --> F -D -- false --> B -F --> G -G -- true --> B -G -- false --> H -H --> B -B -- false --> End -``` + # Transceiver Initialization (at platform bootstrap layer) diff --git a/doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md b/doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md new file mode 100644 index 00000000000..9ee82b1e191 --- /dev/null +++ b/doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md @@ -0,0 +1,180 @@ +# Feature Name +Deterministic Approach for Interface Link bring-up sequence on SFF compliant modules + +# High Level Design Document +#### Rev 0.1 + +# Table of Contents + * [List of Tables](#list-of-tables) + * [Revision](#revision) + * [About This Manual](#about-this-manual) + * [Abbreviation](#abbreviation) + * [References](#references) + * [Problem Definition](#problem-definition) + * [Background](#background) + * [Objective](#objective) + * [Plan](#plan) + * [Breakout 
handling](#breakout-handling) + * [Feature enablement](#feature-enablement) + * [Pre-requisite](#pre-requisite) + * [Proposed Work-Flows](#proposed-work-flows) + +# List of Tables + * [Table 1: Definitions](#table-1-definitions) + * [Table 2: References](#table-2-references) + +# Revision +| Rev | Date | Author | Change Description | +|:---:|:-----------:|:----------------------------------:|-------------------------------------| +| 0.1 | 07/12/2023 | Longyin Huang | Initial version | + + +# About this Manual +This is a high-level design document describing the need to have a deterministic approach on SFF compliant modules for Interface link bring-up sequence and workflows for use-cases around it + +# Abbreviation + +# Table 1: Definitions +| **Term** | **Definition** | +| -------------- | ------------------------------------------------ | +| pmon | Platform Monitoring Service | +| xcvr | Transceiver | +| xcvrd | Transceiver Daemon | +| gbsyncd | Gearbox (External PHY) docker container | + +# References + +# Table 2: References + +| **Document** | **Location** | +|---------------------------------------------------------|---------------| +| Deterministic Approach for Interface Link bring-up sequence for CMIS and SFF modules | [Interface-Link-bring-up-sequence.md](https://github.com/sonic-net/SONiC/blob/master/doc/sfp-cmis/Interface-Link-bring-up-sequence.md) | + + + +# Problem Definition + +1. Presently in SONiC, for SFF compliant modules (100G/40G), there is no synchronization between enabling Tx of optical module and enabling ASIC (NPU/PHY) Tx which may cause link instability during administrative interface enable “config interface startup Ethernet” configuration and bootup scenarios. According to parent [HLD](https://github.com/sonic-net/SONiC/blob/master/doc/sfp-cmis/Interface-Link-bring-up-sequence.md#plan), potential problems are: + - link stability issue which will be difficult to chase in the production network. e.g.
If there is a PHY device in between, PHY may adapt to a bad signal or interface flaps may occur when the optics tx/rx enabled during PHY initialization. + - there is a possibility of interface link flaps with non-quiescent optical modules + +2. During administrative interface disable “config interface shutdown Ethernet”, only the ASIC(NPU) Tx is disabled and not the optical module Tx/laser. + This will lead to power wastage and unnecessary fan power consumption to keep the module temperature in operating range + +# Background + + Refer to parent [HLD](https://github.com/sonic-net/SONiC/blob/master/doc/sfp-cmis/Interface-Link-bring-up-sequence.md#background) + +# Objective + +According to parent [HLD](https://github.com/sonic-net/SONiC/blob/master/doc/sfp-cmis/Interface-Link-bring-up-sequence.md#objective), have a deterministic approach for Interface link bring-up sequence for SFF compliant modules (100G/40G), i.e. the sequence below is to be followed: + 1. Initialize and enable NPU Tx and Rx path + 2. For system with 'External' PHY: Initialize and enable PHY Tx and Rx on both line and host sides; ensure host side link is up + 3. Then perform optics Tx enable + +# Plan + +Plan is to follow this high-level work-flow sequence to accomplish the Objective: +- Add a new thread SFF task manager (called sff_mgr) inside xcvrd to subscribe to existing field “host_tx_ready” in port table state-DB +- “host_tx_ready” is set to true only when admin_status is true and setting admin_status to syncd/gbsyncd is successful.
(As part of setting admin_status to syncd/gbsyncd successfully, the NPU/PHY Tx is enabled/disabled) +- sff_mgr processes the “host_tx_ready” value change event and performs optics Tx enable/disable using the tx_disable API + +# Breakout Handling + +Refer to parent [HLD](https://github.com/sonic-net/SONiC/blob/master/doc/sfp-cmis/Interface-Link-bring-up-sequence.md#breakout-handling) + +# Feature enablement + + This feature (optics Interface Link bring-up sequence) would be enabled on a per-platform basis. + There could be cases where vendor(s)/platform(s) may take time to shift from existing codebase to the model (work-flows) described in this document. +- By default, sff_mgr feature is disabled. +- In order to enable sff_mgr feature, the platform would set ‘enable_xcvrd_sff_mgr’ to ‘true’ in their respective pmon_daemon_control.json. Xcvrd would parse ‘enable_xcvrd_sff_mgr’ and if found 'true', it would launch SFF task manager (sff_mgr). + +# Pre-requisite + +In addition to parent HLD's [pre-requisite](https://github.com/sonic-net/SONiC/blob/master/doc/sfp-cmis/Interface-Link-bring-up-sequence.md#pre-requisite), + +> **_Pre-requisite for enabling sff_mgr:_** +Platform needs to leave the transceiver (if capable of disabling Tx) in Tx disabled state when a module is inserted or during boot-up. This is to make sure the transceiver is not transmitting with Tx enabled before host_tx_ready is True.
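As a rough illustration of the enablement rule described above, the sketch below decides whether sff_mgr should be spawned from the text of a pmon_daemon_control.json-style file. Only the key name ‘enable_xcvrd_sff_mgr’ and the disabled-by-default behaviour come from this document. The function name `sff_mgr_enabled`, the choice to treat malformed input as "disabled", and the way xcvrd actually receives the flag are assumptions made for illustration.

```python
import json


def sff_mgr_enabled(pmon_control_text: str) -> bool:
    """Hypothetical helper: True only if 'enable_xcvrd_sff_mgr' is set to true.

    A missing key, a false value, or unparsable input all leave the
    feature disabled, matching the "disabled by default" rule above.
    """
    try:
        config = json.loads(pmon_control_text)
    except json.JSONDecodeError:
        # Assumption: malformed config is treated the same as "flag not set".
        return False
    if not isinstance(config, dict):
        return False
    # Require a real JSON boolean true, not a truthy string or number.
    return config.get("enable_xcvrd_sff_mgr", False) is True
```

A platform that does not mention the key keeps today's behaviour, because the lookup falls back to `False`.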
+ +# Proposed Work-Flows + + - ### Flow of pre-requisite for platform in insertion/bootup cases + ```mermaid + graph TD; + A[platform brings module out of RESET] + B[platform keeps module in Tx disabled state immediately after module out-of-RESET] + C[xcvrd detects module insertion via platform API get_transceiver_change_event, and updates module status/info to DB] + D[Upon module insertion event, sff_mgr takes action accordingly if needed] + + Start --> A + A --> B + B --> C + C --> D + D --> End + ``` + - ### Feature enablement flow -- how xcvrd spawns sff_mgr thread based on enable_sff_mgr flag + ```mermaid + graph TD; + A[wait for PortConfigDone] + B[check if enable_sff_mgr flag exists and is set to true] + C[spawn sff_mgr] + D[proceed to other thread spawning and tasks] + + Start --> A + A --> B + B -- true --> C + C --> D + B -- false --> D + D --> End + ``` + - ### Flow of calculating target tx_disable value: + - When ```tx_disable value/status``` is ```True```, it means Tx is disabled + - When ```tx_disable value/status``` is ```False```, it means Tx is enabled + ```mermaid + graph TD; + + A[check if both host_tx_ready is True AND admin_status is UP] + B[target tx_disable value is set to False, Tx should be turned ON] + C[target tx_disable value is set to True, Tx should be turned OFF] + + Start --> A + A -- true --> B + A -- false --> C + B --> End + C --> End + ``` + - ### Main flow of sff_mgr, covering below cases: + - system bootup + - transceiver insertion + - admin enable/disable configurations + ```mermaid + graph TD; + A[subscribe to events] + B[while task_stopping_event is not set] + C[check insertion event, host_tx_ready change event and admin_status change event for each intended port] + D[double check if module is present] + E[fetch DB and update host_tx_ready value in local cache, if not available locally] + E2[fetch DB and update admin_status value in local cache, if not available locally] + F[calculate target tx_disable value based on
host_tx_ready and admin_status] + G[check if tx_disable status on module is already the target value] + H[go ahead to enable/disable Tx based on the target tx_disable value] + + Start --> A + A --> B + B -- true --> C + C -- if either event happened --> E + C -- if neither event happened --> B + E --> E2 + E2 --> D + D -- true --> F + D -- false --> B + F --> G + G -- true --> B + G -- false --> H + H --> B + B -- false --> End + ``` + +# Out of Scope +Refer to parent [HLD](https://github.com/sonic-net/SONiC/blob/master/doc/sfp-cmis/Interface-Link-bring-up-sequence.md) From e42ec6bc6263648e06a52bc947c487b249aa92a7 Mon Sep 17 00:00:00 2001 From: Longyin Huang Date: Wed, 19 Jul 2023 18:55:06 -0700 Subject: [PATCH 11/13] Add explaination with parent HLD --- .../Interface-Link-bring-up-sequence-on-sff-modules.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md b/doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md index 9ee82b1e191..7cbe0d6d7f8 100644 --- a/doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md +++ b/doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md @@ -32,6 +32,8 @@ Deterministic Approach for Interface Link bring-up sequence on SFF compliant mod # About this Manual This is a high-level design document describing the need to have determinstic approach on SFF compliant modules for Interface link bring-up sequence and workflows for use-cases around it +Parent HLD [Interface-Link-bring-up-sequence.md](https://github.com/sonic-net/SONiC/blob/master/doc/sfp-cmis/Interface-Link-bring-up-sequence.md) focuses on generic high level background/idea and details for CMIS modules, while this HLD focuses on SFF modules with details. 
+
 # Abbreviation
 # Table 1: Definitions
@@ -53,10 +55,9 @@ This is a high-level design document describing the need to have determinstic ap
 # Problem Definition
+According to parent [HLD](https://github.com/sonic-net/SONiC/blob/master/doc/sfp-cmis/Interface-Link-bring-up-sequence.md#plan), as already discussed with sonic-chassis workgroup and OCP community:
-1. Presently in SONiC, for SFF compliant modules (100G/40G), there is no synchronization between enabling Tx of optical module and enabling ASIC (NPU/PHY) Tx which may cause link instability during administrative interface enable “config interface startup Ethernet” configuration and bootup scenarios. According to parent [HLD](https://github.com/sonic-net/SONiC/blob/master/doc/sfp-cmis/Interface-Link-bring-up-sequence.md#plan), potential problems are:
-   - link stability issue which will be difficult to chase in the production network. e.g. If there is a PHY device in between, PHY may adapt to a bad signal or interface flaps may occur when the optics tx/rx enabled during PHY initialization.
-   - there is a possibility of interface link flaps with non-quiescent optical modules
+1. Presently in SONiC, for SFF compliant modules (100G/40G), there is no synchronization between enabling Tx of optical module and enabling ASIC (NPU/PHY) Tx which may cause link instability during administrative interface enable “config interface startup Ethernet” configuration and bootup scenarios.
 2. During administrative interface disable “config interface shutdown Ethernet”, only the ASIC(NPU) Tx is disabled and not the opticcal module Tx/laser.
    This will lead to power wastage and un-necessary fan power consumption to keep the module temperature in operating range

From 6e3b29faf634e613b4460ec8d6232386cd06913b Mon Sep 17 00:00:00 2001
From: Longyin Huang
Date: Fri, 21 Jul 2023 15:51:49 -0700
Subject: [PATCH 12/13] Add lab hazard description

---
 doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md b/doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md
index 7cbe0d6d7f8..bd4430c492e 100644
--- a/doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md
+++ b/doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md
@@ -60,7 +60,7 @@ According to parent [HLD](https://github.com/sonic-net/SONiC/blob/master/doc/sfp
 1. Presently in SONiC, for SFF compliant modules (100G/40G), there is no synchronization between enabling Tx of optical module and enabling ASIC (NPU/PHY) Tx which may cause link instability during administrative interface enable “config interface startup Ethernet” configuration and bootup scenarios.
 2. During administrative interface disable “config interface shutdown Ethernet”, only the ASIC(NPU) Tx is disabled and not the opticcal module Tx/laser.
-   This will lead to power wastage and un-necessary fan power consumption to keep the module temperature in operating range
+   This will lead to power wastage, un-necessary fan power consumption to keep the module temperature in operating range, and potential lab hazard when the port is shut off but the laser is still on.
 # Background

From dbe5f0f98dc078913ac392721c611027efaaab5b Mon Sep 17 00:00:00 2001
From: Longyin Huang
Date: Tue, 5 Dec 2023 11:48:17 -0800
Subject: [PATCH 13/13] Fix typo

---
 doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md b/doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md
index bd4430c492e..c28ffe55e2b 100644
--- a/doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md
+++ b/doc/xrcvd/Interface-Link-bring-up-sequence-on-sff-modules.md
@@ -59,7 +59,7 @@ According to parent [HLD](https://github.com/sonic-net/SONiC/blob/master/doc/sfp
 1. Presently in SONiC, for SFF compliant modules (100G/40G), there is no synchronization between enabling Tx of optical module and enabling ASIC (NPU/PHY) Tx which may cause link instability during administrative interface enable “config interface startup Ethernet” configuration and bootup scenarios.
-2. During administrative interface disable “config interface shutdown Ethernet”, only the ASIC(NPU) Tx is disabled and not the opticcal module Tx/laser.
+2. During administrative interface disable “config interface shutdown Ethernet”, only the ASIC(NPU) Tx is disabled and not the optical module Tx/laser.
    This will lead to power wastage, un-necessary fan power consumption to keep the module temperature in operating range, and potential lab hazard when the port is shut off but the laser is still on.
 # Background
@@ -77,7 +77,7 @@ According to parent [HLD](https://github.com/sonic-net/SONiC/blob/master/doc/sfp
 Plan is to follow this high-level work-flow sequence to accomplish the Objective:
 - Add a new thread SFF task manager (called sff_mgr) inside xcvrd to subscribe to existing field “host_tx_ready” in port table state-DB
-- “host_tx_ready” is set to true only when admin_status is true and setting admin_status to syncd/gbsyncd is successful.
(As part of setting admin_status to syncd/gbsyncd successfully, the NPU/PHY Tx is enabled/disabled)
+- “host_tx_ready” is set to true only when admin_status is up and setting admin_status to syncd/gbsyncd is successful. (As part of setting admin_status to syncd/gbsyncd successfully, the NPU/PHY Tx is enabled/disabled)
+- sff_mgr processes the “host_tx_ready” value change event and do optics Tx enable/disable using tx_disable API
 # Breakout Handling