Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEC auto determination #1490

Open
wants to merge 31 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
Show all changes
31 commits
Select commit Hold shift + click to select a range
896ee4f
Add auto-fec.md
longhuan-cisco Aug 28, 2023
c19407c
Update doc and location
longhuan-cisco Aug 30, 2023
bc66360
Update auto-fec.md
longhuan-cisco Aug 30, 2023
d7736ca
Update auto-fec.md
longhuan-cisco Aug 30, 2023
533a2f2
Merge branch 'sonic-net:master' into auto_fec
longhuan-cisco Aug 30, 2023
da0afb2
Update doc
longhuan-cisco Sep 1, 2023
e66d759
Update doc
longhuan-cisco Sep 1, 2023
435f37e
Update doc
longhuan-cisco Sep 1, 2023
affcb54
Update doc
longhuan-cisco Sep 1, 2023
c7e586d
Update doc
longhuan-cisco Sep 1, 2023
b302100
Update doc
longhuan-cisco Sep 1, 2023
aaf9f12
Update doc
longhuan-cisco Sep 1, 2023
68afb3f
Update/rename doc
longhuan-cisco Sep 6, 2023
d950f5d
Update doc
longhuan-cisco Sep 6, 2023
981cc55
Update doc
longhuan-cisco Sep 6, 2023
b6c47ca
Update doc
longhuan-cisco Sep 6, 2023
38a0123
Update doc
longhuan-cisco Sep 6, 2023
b6e1b64
Update doc
longhuan-cisco Sep 7, 2023
67cbd2a
Update doc
longhuan-cisco Sep 7, 2023
6a9a5e7
Update doc
longhuan-cisco Sep 8, 2023
1c0915d
Updated determine-fec.md section "Difference between other design"
shyam77git Sep 10, 2023
2c305a3
Merge pull request #2 from shyam77git/patch-4
longhuan-cisco Sep 11, 2023
8d743ca
Update doc and rename file
longhuan-cisco Oct 3, 2023
810ee7f
Update doc
longhuan-cisco Oct 3, 2023
aef6a01
Update doc
longhuan-cisco Oct 3, 2023
ddd2a28
Update doc
longhuan-cisco Oct 3, 2023
9c5e3c3
Update doc
longhuan-cisco Oct 3, 2023
06f2cbe
Update doc
longhuan-cisco Oct 3, 2023
7835676
Remove difference-to-other section
longhuan-cisco Oct 4, 2023
35b5585
Merge branch 'sonic-net:master' into fec_auto_determine
longhuan-cisco Oct 4, 2023
d636dd2
Update wording and FEC mapping
longhuan-cisco Oct 5, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
187 changes: 187 additions & 0 deletions doc/port-fec/fec-auto-determination.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,187 @@
# SONiC FEC Auto Determination Design #

## Table of Content

- [Revision](#revision)
- [Scope](#scope)
- [Definitions/Abbreviations](#definitions/abbreviations)
- [Overview](#overview)
- [High-Level Design](#high-level-design)
- [API design](#api-design)
- [Common Rule for FEC Determination](#common-rule-for-fec-determination)
- [Table 1: FEC Mapping Based on Optics Type](#table-1-fec-mapping-based-on-optics-type)
- [Table 2: FEC Mapping Based on Lane Speed and Number of Lanes](#table-2-fec-mapping-based-on-lane-speed-and-number-of-lanes)
- [Diagram For Different Use Cases](#diagram-for-different-use-cases)
- [Dependency](#dependency)

### Revision

| Rev | Date | Author | Change Description |
|:---:|:-----------:|:-------------------:|--------------------------------------------|
| 0.1 | | Shyam Kumar, Longyin Huang | Initial version |

### Scope
This document is the design document for FEC auto-determination feature on SONiC.

### Definitions/Abbreviations
| **Term** | **Definition** |
| -------------- | ------------------------------------------------ |
| FEC | Forward Error Correction |
| DPB | Dynamic Port Breakout |

### Overview

FEC mode is a critical configuration for a port, which needs to be configured properly for the port to come up.

There are below scenarios that can end up with wrong FEC mode:
1. In DPB(Dynamic Port Breakout) case, today's DPB CLI doesn't generate FEC config for newly created ports, FEC mode is default to `none` at [SAI/SDK](https://github.com/opencomputeproject/SAI/blob/a94bbbe43242a4d9e1a4d9f70780ea9251127f5d/inc/saiport.h#L1012) layer.
2. In non-DPB case,
- Some platforms have no FEC configured in CONFIG_DB by default. The FEC mode can be either default to `none` at SAI/SDK layer or manually configured by user who might not have enough domain knowledge.
- Some platforms have default FEC defined in `port_config.ini`, which however might not be suitable for the specific port/optics on the system.

The feature in this document is to address the issue in both of above scenarios in a common platform-independent way, since the rule to determine FEC for a given port/optics is common for all platforms.

### High-Level Design

Add `determine-fec` module which can determine FEC mode based on common rule for a given port with a given optics. This module provides a `determine_fec` API which can be invoked in below use cases:
1. DPB use case: Enhance today's DPB CLI to automatically determine and configure FEC in CONFIG_DB for dynamically created ports, based on determine-fec module.
2. non-DPB use case: Add a user-triggered CLI `fec-auto-correct` to automatically determine and configure FEC in CONFIG_DB for existing ports, based on determine-fec module.
- Future plan: determine-fec module can be further enhanced to be integrated with xcvrd, which can be triggered automatically during transceiver insertion, without human intervention. (details TBD)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the HLD as discussed in the community to capture the flow where FEC setting is generated by xcvrd based on media type and pushed to application DB.


### API design
```
def determine_fec(lane_speed: int, num_lanes: int, optics_type: Optional[str] = None) -> str:
"""
Determines the appropriate Forward Error Correction (FEC) type based on lane speed, number of lanes, and optics type for a specific logical port.
This logic is based on FEC mapping rules common for all platforms.

Parameters:
- lane_speed (int): The speed of each lane in GB.
- num_lanes (int): The total number of lanes.
- optics_type (Optional[str]): The type of optics in use. Can be None if not applicable.

Returns:
- str: The recommended FEC type based on the common rules. It can be either 'none'/'rs'/'fc'.

Example:
>>> determine_fec(25, 4, "100G-SR4")
"rs"

"""
```

### Common Rule for FEC Determination

#### Table 1: FEC Mapping Based on Optics Type
| Optics Type | FEC |
|-------------|------|
| 40G | none |
| 100G-DR | none |
| 100G-FR | none |
| 100G-LR | none |
| 100G-LR4 | none |
| 100G-ER4 | none |
| 100G AOC | none |
| 400G | rs |
| ALL_OTHER | rs |

#### Table 2: FEC Mapping Based on Lane Speed and Number of Lanes
| Lane Speed | Number of Lanes (per Logical Port) | FEC |
|------------|-----------------------------------|------|
| 10 | 1 | none |
| 10 | 4 | none |
| 20 | 2 | none |
| 25 | 1 | rs |
| 25 | 2 | rs |
| 25 | 4 | rs |
| 25 | 8 | rs |
| 50 | 1 | rs |
| 50 | 2 | rs |
| 50 | 4 | rs |
| 50 | 8 | rs |
| 50 | 16 | rs |

Above tables can be defined as JSON file, and be loaded by determine-fec module. Platform can also override the default FEC mapping by providing its own JSON file.

> [!NOTE]
> If a port has matched FEC entry in both above tables, then prefers FEC entry in first table. (Only exception: For `lane_speed=10, num_lane=1 or 4`, we prefer FEC entry in second table)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First table defines the Optical media interface only. there can be on a single optical interface different types of electrical interface which define different FEC types.
for example: 100GBASE-DR (Clause 140) defines optical interface of 1 fiber lane.
Electrical interfaces that can be are (according to IEEE clause 140)
CAUI-4 (100G on 4 electrical lanes) - NO-FEC
100GAUI-2 (100G on 2 electrical lanes) - RS-FEC
100GAUI-1 (100G on 1 electrical lane) - RS FEC

This does not align with the statement in line 107 that " If a port has matched FEC entry in both above tables, then prefers FEC entry in first table".
FEC should be decided based on the Electrical interface (Second table) and not the first table in optical cable cases

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for example: 100GBASE-DR (Clause 140) defines optical interface of 1 fiber lane. Electrical interfaces that can be are (according to IEEE clause 140) CAUI-4 (100G on 4 electrical lanes) - NO-FEC 100GAUI-2 (100G on 2 electrical lanes) - RS-FEC 100GAUI-1 (100G on 1 electrical lane) - RS FEC

Just to confirm, for host side fec, were you saying we need to have the following mappings and it's standard?

optics_type=100G-DR, lane_speed=25, num_lanes=4, fec=none
optics_type=100G-DR, lane_speed=50, num_lanes=2, fec=rs
optics_type=100G-DR, lane_speed=100, num_lanes=1, fec=rs

FEC should be decided based on the Electrical interface (Second table) and not the first table in optical cable cases

I think we still need to rely on both optics_type(1st table) and electrical interface (2nd table).
For example, for the case of (lane_speed=25, num_lanes=4), fec can be either RS (for 100G CWDM4/100G PSM4/100G-CR/etc) or
none (for 100G-DR/100G-FR/etc)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, there are some cases where on same lane_speed you can have different FEC results.
so it should based on inputs from both tables, but not just one of them



### Diagram For Different Use Cases
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the diagrams capturing the flow from xcvrd.
For the DPB use case, since xcvrd today listens to port breakout and generates settings, it would be better for xcvrd to use the same flow as during initialization to push the FEC setting for newly added ports in case of DPB


```mermaid
sequenceDiagram
title FEC determination in DPB case

actor user as User
participant dpb_cli as DPB CLI
participant determine_fec as determine-fec module
participant config_db as CONFIG_DB
participant syncd as SYNCD


user ->>+ dpb_cli: do breakout on a port
note over dpb_cli: run today's flow per port:
dpb_cli ->>+ dpb_cli: generate the config(speed/lanes/etc) for each new port

note over dpb_cli,determine_fec: run below additional logic per port:
dpb_cli ->>+ determine_fec: call API determine_fec(lane_speed, num_lanes) per port
determine_fec ->>+ determine_fec: calculate FEC mode based on common rule
determine_fec -->>- dpb_cli: return FEC mode

note over dpb_cli, syncd: run today's flow:
dpb_cli ->>+ config_db: Add new ports in PORT table (today's flow), <BR> additionally with proper FEC if determined above

par
config_db -->>- dpb_cli: Done
and
config_db ->>+ syncd: notify for new port creation
end

dpb_cli -->>- user: Done
```

```mermaid
sequenceDiagram
title FEC determination in non-DPB case

actor user as User
participant fec_correct_cli as fec-auto-correct CLI <BR> (just a wrapper)
participant determine_fec as determine-fec module
participant state_db as STATE_DB
participant config_db as CONFIG_DB
participant syncd as SYNCD

user ->>+ fec_correct_cli: auto-correct FEC mode for all ports
fec_correct_cli ->>+ determine_fec: call API correct_fec_for_all_ports()

loop every port
determine_fec ->>+ state_db: read optics_type from TRANSCEIVER_INFO table
state_db -->>- determine_fec: return optics_type
determine_fec ->>+ determine_fec: internally call API determine_fec(lane_speed, num_lanes, optics_type) <BR> which calculates FEC mode based on common rule
end

determine_fec ->>+ config_db: update FEC if needed

par
config_db -->>- determine_fec: Done
and
config_db ->>+ syncd: notify for FEC update
end

determine_fec -->>- fec_correct_cli: Done
fec_correct_cli -->>- user: Done

```

> [!NOTE]
> In the above use cases, automatically determined FEC mode is saved in running config, user needs to do ```config save``` to save it permanently.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in the community review, the FEC settings should be added to app_db and not config_db. In addition there should be a global configuration knob to allow xcvrd auto setting FEC. This knob should be disabled by default to honor backward compatibility.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dgsudharsan
That's correct. We would be adding these two updates to the HLD

Also, we would update this HLD with following as well:

  • Remove CLI mention from the HLD and directly mention regular use-case i.e. xcvrd is the consumer of new FEC determination API (mentioned in this HLD)
  • Anytime, there is FEC config in config_db, it would supersede any existing or upcoming FEC setting.
    For this HLD it implies: Apply FEC determination API only if there is no FEC entry/setting in config_db.


### Dependency
A new ```optics_type``` field (human-readable type for optics, such as ```100G-DR```, ```100G-FR```, etc) will be added to TRANSCEIVER_INFO table, so that determine-fec module can read it for the non-DPB use case.

To implement this, ```optics_type``` can be determined based on today's transceiver_info, and be added as part of output of API [get_transceiver_info()](https://github.com/sonic-net/sonic-platform-common/blob/1988b37c7668394f38f155c86f5462a4461fe82e/sonic_platform_base/sonic_xcvr/api/xcvr_api.py#L42-L71) in ```sonic-platform-common``` repo.

```optics_type``` field can also provide benefits in readability/service-ability/debug-ability:
1. help user/engineer to easily and quickly identify what optics are plugged onto the router (if it can be added to show CLI output later)
2. test script can easily figure out the optics type based on this single ```optics_type``` field and do test actions accordingly.