Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Static Member Selection Modes For ECMP and LAG #2078

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
217 changes: 217 additions & 0 deletions doc/SAI-Proposal-Static-Member-Selection-Modes-For-ECMP-And-LAG.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,217 @@
# [SAI] Static Member Selection Modes for ECMP and LAG
-------------------------------------------------------------------------------
Title | Static Member Selection Modes for ECMP and LAG
-------------|-----------------------------------------------------------------
Authors | Dhruvkumar Kadia, Ravindranath C K (Marvell)
Status | In review
Type | Standards track
Created | 2024-09-19
SAI-Version | 1.16
-------------------------------------------------------------------------------

## 1.0 Introduction

ECMP or Next hop groups (NHG) and LAG use load balancing techniques to spread the traffic across the various members. The most common load balancing technique is to use hash-based member selection. In this technique, flows are identified by performing a hash on a set of packet fields like the 5-tuple, and then selecting the group member based on the computed hash value.
This technique has been traditionally used since it:
1) ensures packet ordering (since packets of a given flow are always sent out on the same member), and
2) is stateless.

However, traditional hash-based load balancing techniques are *static* in the sense that they do not adapt according to the actual state (link failure, port load etc.) of the member. Adaptive Routing and Switching (ARS), is a recent load balancing technique that adapts to the state of the members. ARS maintains state of the member (link status, port load, queue load, path load etc.) to select the ECMP/LAG member that are active and have least load. ARS modes such as SAI_ARS_MODE_FLOWLET_QUALITY also maintain state (idle time and assigned member) per flow in order to ensure packets of a flow are not sent out of order.

There are emerging HPC and AI/ML use-cases where packets of the same flow are sprayed across multiple paths to achieve high bandwidth. In these applications, the packet ordering is done by the higher layers of the network stack at the host nodes; the network does not need to ensure packet order.

This proposal introduces the below two types of light weight *static* member selection modes for ECMP/LAG to address these use-cases:

* Random
* Round Robin

## 2.0 Behavior

### Existing hash based member selection
For every packet hitting the ECMP/LAG, the egress member is selected based on the computed hash value.
![Hash based member selection](figures/Hash_based.png)

### Random member selection
For every packet hitting the ECMP/LAG, the egress member is randomly selected.
![Random member selection](figures/Random.png)

Unlike ARS modes like SAI_ARS_MODE_PER_PACKET_RANDOM, the static member selection modes do not adapt based on the member link status.

### Round-robin member selection
For every packet hitting the ECMP/LAG, the egress member is selected in round robin mode.
![Round-robin member selection](figures/Round_robin.png)

#### Interaction with weighted ECMP
For every packet hitting the ECMP/LAG, the egress member is selected in round robin mode considering the weights of member.
![Round-robin member selection with weighted ecmp](figures/Round_robin_with_weights.png)

## 3.0 SAI Enhancement

A new enum type is defined to denote the static member selection modes.
New attributes are introduced in the NHG and LAG to allow the user to configure the static member selection mode.
Further, an ACL action is introduced to set the static member selection mode.

1) Enum defining the static member selection mode :
```c
/**
* @brief Static member selection mode
*
* Used to select how the next hop group and LAG members are resolved.
* Member can be selected based on hash, random or round robin.
*/
typedef enum _sai_static_member_selection_mode_t
{
/** Static member selection mode based on hash of packet fields */
SAI_STATIC_MEMBER_SELECTION_MODE_HASH,

/** Static member selection mode is random */
SAI_STATIC_MEMBER_SELECTION_MODE_RANDOM,

/** Static member selection mode is round robin */
SAI_STATIC_MEMBER_SELECTION_MODE_ROUND_ROBIN,

} sai_static_member_selection_mode_t;
```
2) Attributes for ECMP and LAG to set one of the mode:
```c
/**
* @brief Attribute id for next hop
*/
typedef enum _sai_next_hop_group_attr_t
{
...
/**
* @brief Nexthop group static member selection mode
* Member can be selected based on hash, random or round robin.
*
* @type sai_static_member_selection_mode_t
* @flags CREATE_ONLY
* @default SAI_STATIC_MEMBER_SELECTION_MODE_HASH
* @validonly SAI_NEXT_HOP_GROUP_ATTR_TYPE == SAI_NEXT_HOP_GROUP_TYPE_DYNAMIC_UNORDERED_ECMP or SAI_NEXT_HOP_GROUP_ATTR_TYPE == SAI_NEXT_HOP_GROUP_TYPE_DYNAMIC_ORDERED_ECMP or SAI_NEXT_HOP_GROUP_ATTR_TYPE == SAI_NEXT_HOP_GROUP_TYPE_FINE_GRAIN_ECMP
*/
SAI_NEXT_HOP_GROUP_ATTR_STATIC_MEMBER_SELECTION_MODE,
...
} sai_next_hop_group_attr_t;
```

```c
/**
* @brief LAG attribute: List of attributes for LAG object
*/
typedef enum _sai_lag_attr_t
{
...
/**
* @brief LAG static member selection mode
* Member can be selected based on hash, random or round robin.
*
* @type sai_static_member_selection_mode_t
* @flags CREATE_ONLY
* @default SAI_STATIC_MEMBER_SELECTION_MODE_HASH
*/
SAI_LAG_ATTR_STATIC_MEMBER_SELECTION_MODE,
...
} sai_lag_attr_t;
```
3) ACL action:
```c
/**
* @brief ACL Action Type
*/
typedef enum _sai_acl_action_type_t
{
...
/** Set ECMP static member selection mode */
SAI_ACL_ACTION_TYPE_SET_ECMP_STATIC_MEMBER_SELECTION_MODE = 0x00000039,

} sai_acl_action_type_t;

/**
* @brief Attribute Id for sai_acl_entry
*
* @flags ranges
*/
typedef enum _sai_acl_entry_attr_t
{
...
/**
* @brief Set ECMP static member selection mode
*
* @type sai_acl_action_data_t sai_static_member_selection_mode_t
* @flags CREATE_AND_SET
* @default disabled
*/
SAI_ACL_ENTRY_ATTR_ACTION_SET_ECMP_STATIC_MEMBER_SELECTION_MODE = SAI_ACL_ENTRY_ATTR_ACTION_START + 0x39,

/**
* @brief End of Rule Actions
*/
SAI_ACL_ENTRY_ATTR_ACTION_END = SAI_ACL_ENTRY_ATTR_ACTION_SET_ECMP_STATIC_MEMBER_SELECTION_MODE,
...
} sai_acl_entry_attr_t;
```

## 4.0 API Example

### Create Next Hop Group

```c
...(Existing Attribute)
sai_attr_list[attr_count].id = SAI_NEXT_HOP_GROUP_ATTR_STATIC_MEMBER_SELECTION_MODE;
sai_attr_list[attr_count++].value.u32 = SAI_STATIC_MEMBER_SELECTION_MODE_RANDOM;

sai_create_next_hop_group_fn(
&nhg_oid,
switch_id,
attr_count,
sai_attr_list);
```

### Create LAG

```c
...(Existing Attribute)
sai_attr_list[attr_count].id = SAI_LAG_ATTR_STATIC_MEMBER_SELECTION_MODE;
sai_attr_list[attr_count++].value.u32 = SAI_STATIC_MEMBER_SELECTION_MODE_ROUND_ROBIN;

sai_create_lag_fn(
&lag_oid,
switch_id,
attr_count,
sai_attr_list);
```
### Create ACL table
```c
count = 0;
sai_attr_list[count].id = SAI_ACL_TABLE_ATTR_ACL_STAGE;
sai_attr_list[count++].value.u32 = SAI_ACL_STAGE_INGRESS;
sai_attr_list[count].id = SAI_ACL_TABLE_ATTR_FIELD_SRC_IP;
sai_attr_list[count++].value.booldata = 1;
action_attr_list[0] = SAI_ACL_ACTION_TYPE_SET_ECMP_STATIC_MEMBER_SELECTION_MODE;
sai_action_attr_list.list = action_attr_list;
sai_action_attr_list.count = 1;
sai_attr_list[count].id = SAI_ACL_TABLE_ATTR_ACL_ACTION_TYPE_LIST;
sai_attr_list[count++].value.s32list = sai_action_attr_list;
attr_count = count;
sai_create_acl_table_fn(
&acl_table_id,
switch_id,
attr_count,
sai_attr_list);
```

### Create ACL entry
```c
count=0;
sai_attr_list[count].id = SAI_ACL_ENTRY_ATTR_TABLE_ID;
sai_attr_list[count++].value.oid = acl_table_id;
sai_attr_list[count].id = SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP;
sai_attr_list[count++].value.aclfield.data.ip4 = 0x0a000002;
sai_attr_list[count].id = SAI_ACL_ENTRY_ATTR_ACTION_SET_ECMP_STATIC_MEMBER_SELECTION_MODE;
sai_attr_list[count++].value.aclfield.data.u32 = SAI_STATIC_MEMBER_SELECTION_MODE_ROUND_ROBIN;
sai_create_acl_entry_fn(
&acl_entry_id,
switch_id,
attr_count,
sai_attr_list);
```
Binary file added doc/figures/Hash_based.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added doc/figures/Random.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added doc/figures/Round_robin.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added doc/figures/Round_robin_with_weights.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
15 changes: 14 additions & 1 deletion inc/saiacl.h
Original file line number Diff line number Diff line change
Expand Up @@ -288,6 +288,10 @@ typedef enum _sai_acl_action_type_t

/** Next Chain Group */
SAI_ACL_ACTION_TYPE_CHAIN_REDIRECT = 0x00000038,

/** Set ECMP static member selection mode */
SAI_ACL_ACTION_TYPE_SET_ECMP_STATIC_MEMBER_SELECTION_MODE = 0x00000039,

} sai_acl_action_type_t;

/**
Expand Down Expand Up @@ -3289,10 +3293,19 @@ typedef enum _sai_acl_entry_attr_t
*/
SAI_ACL_ENTRY_ATTR_ACTION_CHAIN_REDIRECT = SAI_ACL_ENTRY_ATTR_ACTION_START + 0x38,

/**
* @brief Set ECMP static member selection mode
*
* @type sai_acl_action_data_t sai_static_member_selection_mode_t
* @flags CREATE_AND_SET
* @default disabled
*/
SAI_ACL_ENTRY_ATTR_ACTION_SET_ECMP_STATIC_MEMBER_SELECTION_MODE = SAI_ACL_ENTRY_ATTR_ACTION_START + 0x39,

/**
* @brief End of Rule Actions
*/
SAI_ACL_ENTRY_ATTR_ACTION_END = SAI_ACL_ENTRY_ATTR_ACTION_CHAIN_REDIRECT,
SAI_ACL_ENTRY_ATTR_ACTION_END = SAI_ACL_ENTRY_ATTR_ACTION_SET_ECMP_STATIC_MEMBER_SELECTION_MODE,

/**
* @brief End of ACL Entry attributes
Expand Down
10 changes: 10 additions & 0 deletions inc/sailag.h
Original file line number Diff line number Diff line change
Expand Up @@ -203,6 +203,16 @@ typedef enum _sai_lag_attr_t
*/
SAI_LAG_ATTR_ARS_PORT_REASSIGNMENTS,

/**
* @brief LAG static member selection mode
* Member can be selected based on hash, random or round robin.
*
* @type sai_static_member_selection_mode_t
* @flags CREATE_ONLY
* @default SAI_STATIC_MEMBER_SELECTION_MODE_HASH
*/
SAI_LAG_ATTR_STATIC_MEMBER_SELECTION_MODE,

/**
* @brief End of attributes
*/
Expand Down
11 changes: 11 additions & 0 deletions inc/sainexthopgroup.h
Original file line number Diff line number Diff line change
Expand Up @@ -276,6 +276,17 @@ typedef enum _sai_next_hop_group_attr_t
*/
SAI_NEXT_HOP_GROUP_ATTR_NEXT_HOP_MEMBER_COUNTER_LIST,

/**
* @brief Nexthop group static member selection mode
* Member can be selected based on hash, random or round robin.
*
* @type sai_static_member_selection_mode_t
* @flags CREATE_ONLY
* @default SAI_STATIC_MEMBER_SELECTION_MODE_HASH
* @validonly SAI_NEXT_HOP_GROUP_ATTR_TYPE == SAI_NEXT_HOP_GROUP_TYPE_DYNAMIC_UNORDERED_ECMP or SAI_NEXT_HOP_GROUP_ATTR_TYPE == SAI_NEXT_HOP_GROUP_TYPE_DYNAMIC_ORDERED_ECMP
*/
SAI_NEXT_HOP_GROUP_ATTR_STATIC_MEMBER_SELECTION_MODE,

/**
* @brief End of attributes
*/
Expand Down
19 changes: 19 additions & 0 deletions inc/saitypes.h
Original file line number Diff line number Diff line change
Expand Up @@ -1999,6 +1999,25 @@ typedef struct _sai_switch_health_data_t
sai_health_data_t data;
} sai_switch_health_data_t;

/**
* @brief Static member selection mode
*
* Used to select how the Next hop group and LAG members are resolved.
* Member can be selected based on hash, random or round robin.
*/
typedef enum _sai_static_member_selection_mode_t
{
/** Static member selection mode is hash */
SAI_STATIC_MEMBER_SELECTION_MODE_HASH,

/** Static member selection mode is random */
SAI_STATIC_MEMBER_SELECTION_MODE_RANDOM,

/** Static member selection mode is round robin */
SAI_STATIC_MEMBER_SELECTION_MODE_ROUND_ROBIN,

} sai_static_member_selection_mode_t;

/**
* @}
*/
Expand Down
Loading