[Mellanox] Avoid attaching lossless buffer profiles for internal ports #18978

Merged: 11 commits, Jul 31, 2024

Conversation

vivekrnv
Contributor

What I did

  1. Internal ports of a SmartSwitch do not carry RDMA traffic, so lossless buffer profiles need not be applied to them and PFC need not be enabled on them. See vivekrnv@6accbf5 for a clearer view of the changes in buffer_default_objects.j2.

  2. Update the Mellanox-SN4700-O28 SKU buffer profiles to match the following configuration. The topology is inferred from the t1-28-lag topology (Add t1-28-lag topology sonic-mgmt#9837).

Port configuration                               Value
-----------------------------------------------  ---------------------------------------
Breakout mode for each port                      Defined in port mapping
Speed of the port                                Defined in port mapping
Internal ports                                   Defined in port mapping

Buffer configuration                             Value
-----------------------------------------------  ---------------------------------------
Shared headroom                                  Enabled
Shared headroom pool factor                      2
Dynamic buffer                                   Disabled
Uplinks and downlinks (static buffer scenario)   8 1x400G uplinks and 20 1x400G downlinks

Port Mapping

Ports   Mode
------  ---------------------------------------------
1-8     1x400G
9-28    1x400G
29-32   1x200G (internal ports connected to the DPU)

Number of Uplinks / Downlinks:

T1 topology:
Length of downlink: 40m
Length of uplink: 300m
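
The port mapping and cable lengths above imply a per-port choice of priority-group profile. The following is a minimal sketch of that choice, not the template logic from this PR. It assumes ports 1-8 are the uplinks, ports 9-28 are the downlinks, and ports 29-32 are the internal DPU-facing ports. `port_role` and `pg_profile` are hypothetical helper names, and the lossless profile names follow the pattern seen in the swss.rec output further down in this description.

```python
# Hypothetical helpers for illustration only; the port ranges and cable
# lengths are taken from the tables in this PR description.
CABLE_LENGTH = {"uplink": "300m", "downlink": "40m"}


def port_role(port: int) -> str:
    """Classify a front-panel port number (1-32) by its assumed role."""
    if 1 <= port <= 8:
        return "uplink"
    if 9 <= port <= 28:
        return "downlink"
    if 29 <= port <= 32:
        return "internal"
    raise ValueError(f"port {port} is outside the 1-32 range")


def pg_profile(port: int, speed: int = 400000) -> str:
    """Return the profile name for the lossless priority groups (3-4)."""
    role = port_role(port)
    if role == "internal":
        # Internal ports carry no RDMA traffic, so they stay lossy.
        return "ingress_lossy_profile"
    return f"pg_lossless_{speed}_{CABLE_LENGTH[role]}_profile"


if __name__ == "__main__":
    for port in (1, 9, 28, 29, 32):
        print(port, port_role(port), pg_profile(port))
```

The internal ports never reach the speed or cable-length lookup, which mirrors the intent of item 1: no lossless headroom is sized for them at all.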

Work item tracking
  • Microsoft ADO (number only):

How I did it

How to verify it

  1. Unit Tests

  2. Verify IP traffic between DPU and NPU through internal ports after applying buffer profiles

root@r-leopard-79:/home/admin# show int status
      Interface                            Lanes    Speed    MTU    FEC    Alias             Vlan    Oper    Admin                                             Type    Asym PFC
---------------  -------------------------------  -------  -----  -----  -------  ---------------  ------  -------  -----------------------------------------------  ----------
    Ethernet224  224,225,226,227,228,229,230,231     200G   9100    N/A    etp29           routed      up       up                                DPU-NPU Data Port         off
    Ethernet232  232,233,234,235,236,237,238,239     200G   9100    N/A    etp30           routed      up       up                                DPU-NPU Data Port         off
    Ethernet240  240,241,242,243,244,245,246,247     200G   9100    N/A    etp31           routed      up       up                                DPU-NPU Data Port         off
    Ethernet248  248,249,250,251,252,253,254,255     200G   9100    N/A    etp32           routed      up       up                                DPU-NPU Data Port         off
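
As a rough aid for this step, the internal ports can be picked out of the `show int status` output by their Type column. This sketch assumes the command output has already been captured into a string and that internal ports are exactly the rows whose type reads `DPU-NPU Data Port`, as in the output above. `internal_ports` is a hypothetical helper, not part of any SONiC CLI.

```python
def internal_ports(show_output: str, marker: str = "DPU-NPU Data Port") -> list:
    """Return interface names of rows whose text contains the marker."""
    ports = []
    for line in show_output.splitlines():
        tokens = line.split()
        # The first column is the interface name, e.g. Ethernet224.
        if marker in line and tokens and tokens[0].startswith("Ethernet"):
            ports.append(tokens[0])
    return ports


if __name__ == "__main__":
    sample = (
        "Interface  Lanes  Speed  MTU  FEC  Alias  Vlan  Oper  Admin  Type  Asym PFC\n"
        "Ethernet224  224,225  200G  9100  N/A  etp29  routed  up  up  DPU-NPU Data Port  off\n"
        "Ethernet0  0,1  400G  9100  N/A  etp1  routed  up  up  N/A  off\n"
    )
    print(internal_ports(sample))
```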

Run config qos reload and verify Buffer tables:

"BUFFER_PG": {
       ....................
        "Ethernet224|3-4": {
            "profile": "ingress_lossy_profile"
        },
        "Ethernet232|3-4": {
            "profile": "ingress_lossy_profile"
        },
        "Ethernet240|3-4": {
            "profile": "ingress_lossy_profile"
        },
        "Ethernet248|3-4": {
            "profile": "ingress_lossy_profile"
        }
    },

"BUFFER_QUEUE": {
       "Ethernet224|3-4": {
            "profile": "q_lossy_profile"
        },
        "Ethernet232|3-4": {
            "profile": "q_lossy_profile"
        },
        "Ethernet240|3-4": {
            "profile": "q_lossy_profile"
        },
        "Ethernet248|3-4": {
            "profile": "q_lossy_profile"
        }
    },

"BUFFER_PORT_INGRESS_PROFILE_LIST": {
        "Ethernet224": {
            "profile_list": "ingress_lossy_profile"
        },
        "Ethernet232": {
            "profile_list": "ingress_lossy_profile"
        },
        "Ethernet240": {
            "profile_list": "ingress_lossy_profile"
        },
        "Ethernet248": {
            "profile_list": "ingress_lossy_profile"
        }
    },

"BUFFER_PORT_EGRESS_PROFILE_LIST": {
        "Ethernet224": {
            "profile_list": "egress_lossy_profile"
        },
        "Ethernet232": {
            "profile_list": "egress_lossy_profile"
        },
        "Ethernet240": {
            "profile_list": "egress_lossy_profile"
        },
        "Ethernet248": {
            "profile_list": "egress_lossy_profile"
        }
}

"PORT_QOS_MAP": {
       "Ethernet224": {
            "dscp_to_tc_map": "AZURE",
            "pfc_to_pg_map": "AZURE",
            "pfc_to_queue_map": "AZURE",
            "tc_to_pg_map": "AZURE",
            "tc_to_queue_map": "AZURE"
        }
}

"QUEUE": {
        "Ethernet224|3": {
            "scheduler": "scheduler.0"
        },
        "Ethernet224|4": {
            "scheduler": "scheduler.0"
        }
}
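
The manual check of the buffer tables above can be sketched as a small script. This is an illustrative check, not a test shipped with this PR: it assumes the configuration has been loaded into a Python dict shaped like the excerpts above, and that any profile whose name contains `lossless` counts as a lossless profile. `find_lossless_on_internal` and `INTERNAL_PORTS` are hypothetical names.

```python
# Internal ports as listed in the show int status output of this PR.
INTERNAL_PORTS = {"Ethernet224", "Ethernet232", "Ethernet240", "Ethernet248"}

BUFFER_TABLES = (
    "BUFFER_PG",
    "BUFFER_QUEUE",
    "BUFFER_PORT_INGRESS_PROFILE_LIST",
    "BUFFER_PORT_EGRESS_PROFILE_LIST",
)


def find_lossless_on_internal(config: dict, internal_ports=INTERNAL_PORTS) -> list:
    """Return (table, key, profile) for every lossless profile on an internal port."""
    problems = []
    for table in BUFFER_TABLES:
        for key, fields in config.get(table, {}).items():
            port = key.split("|", 1)[0]  # "Ethernet224|3-4" -> "Ethernet224"
            if port not in internal_ports:
                continue
            for field in ("profile", "profile_list"):
                # profile_list may hold several comma-separated names.
                for name in fields.get(field, "").split(","):
                    if "lossless" in name:
                        problems.append((table, key, name))
    return problems


if __name__ == "__main__":
    config = {
        "BUFFER_PG": {"Ethernet224|3-4": {"profile": "ingress_lossy_profile"}},
        "BUFFER_QUEUE": {"Ethernet224|3-4": {"profile": "q_lossy_profile"}},
    }
    print(find_lossless_on_internal(config))
```

An empty result means no lossless profile is attached to any of the listed internal ports; front-panel ports are deliberately ignored.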

swss.rec output for traditional mode

2024-04-30.21:51:29.193211|BUFFER_POOL_TABLE:egress_lossless_pool|SET|mode:dynamic|size:60817392|type:egress
2024-04-30.21:51:29.193238|BUFFER_POOL_TABLE:egress_lossy_pool|SET|mode:dynamic|size:50270208|type:egress
2024-04-30.21:51:29.193258|BUFFER_POOL_TABLE:ingress_lossless_pool|SET|mode:dynamic|size:50270208|type:ingress|xoff:5611520
2024-04-30.21:51:29.223414|BUFFER_PROFILE_TABLE:pg_lossless_400000_300m_profile|SET|dynamic_th:0|pool:ingress_lossless_pool|size:38912|xoff:358400|xon:38912
2024-04-30.21:51:29.223454|BUFFER_PROFILE_TABLE:egress_lossless_profile|SET|dynamic_th:7|pool:egress_lossless_pool|size:0
2024-04-30.21:51:29.223474|BUFFER_PROFILE_TABLE:pg_lossless_400000_40m_profile|SET|dynamic_th:0|pool:ingress_lossless_pool|size:38912|xoff:144384|xon:38912
2024-04-30.21:51:29.223490|BUFFER_PROFILE_TABLE:ingress_lossless_profile|SET|dynamic_th:7|pool:ingress_lossless_pool|size:0
2024-04-30.21:51:29.223524|BUFFER_PROFILE_TABLE:ingress_lossy_profile|SET|dynamic_th:3|pool:ingress_lossless_pool|size:0
2024-04-30.21:51:29.223546|BUFFER_PROFILE_TABLE:egress_lossy_profile|SET|dynamic_th:7|pool:egress_lossy_pool|size:9216
2024-04-30.21:51:29.223564|BUFFER_PROFILE_TABLE:q_lossy_profile|SET|dynamic_th:3|pool:egress_lossy_pool|size:0
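
To compare these records against the expected pool and profile values, each line can be split into its parts. The sketch below assumes only the layout visible in the lines above (timestamp, `TABLE:key`, operation, then `field:value` pairs, all separated by `|`); `parse_swss_rec` is a hypothetical helper and does not handle any other record shapes.

```python
def parse_swss_rec(line: str) -> dict:
    """Split one swss.rec line of the form ts|TABLE:key|OP|field:value|..."""
    parts = line.strip().split("|")
    if len(parts) < 3:
        raise ValueError(f"unexpected record: {line!r}")
    timestamp, table_key, op = parts[0], parts[1], parts[2]
    # Only the first ':' separates the table name from the key.
    table, _, key = table_key.partition(":")
    fields = {}
    for pair in parts[3:]:
        name, _, value = pair.partition(":")
        fields[name] = value
    return {"timestamp": timestamp, "table": table, "key": key, "op": op, "fields": fields}


if __name__ == "__main__":
    line = (
        "2024-04-30.21:51:29.223414|BUFFER_PROFILE_TABLE:pg_lossless_400000_300m_profile"
        "|SET|dynamic_th:0|pool:ingress_lossless_pool|size:38912|xoff:358400|xon:38912"
    )
    record = parse_swss_rec(line)
    print(record["key"], record["fields"]["xoff"])
```

Values are kept as strings, so callers that want to compare sizes numerically need to convert them with `int` first.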

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

vivekrnv and others added 10 commits April 19, 2024 18:49
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
@vivekrnv vivekrnv requested a review from lguohan as a code owner May 16, 2024 03:25
@vivekrnv vivekrnv requested review from stephenxs and dgsudharsan and removed request for lguohan May 16, 2024 03:25
@vivekrnv
Contributor Author

/azpw ms_conflict

Collaborator

Why can't this be unified with ACS-MSN2700?

It is hard to maintain a separate file.

Collaborator

The same applies to buffers_defaults_t0/t1.j2.

It is hard to maintain a separate file.

Contributor Author

We already maintain several buffer_defaults_objects.j2 files, one per scenario. For example, the file under 2700/D48C8 supports shared headroom and extra queues, while the one under 2700/ACS-2700 does not support shared headroom.

Since we already maintain more than one file based on use case, I'd prefer not to couple the SmartSwitch changes with the existing ones. It is easier to maintain this way.

buffers_defaults_t0/t1.j2 is specific to each SKU anyway, so it will be different.

Collaborator

@lguohan
The main logic of the buffer templates is implemented in buffer_defaults_objects.j2.
We already have two different buffer_defaults_objects.j2 files: one for single ingress pool mode with shared headroom, and the other for double ingress pool mode without a shared headroom pool.
We do it this way because combining them into one would make the template very difficult to understand and maintain.
This is a similar scenario, so we add another buffer_defaults_objects.j2 for the smart switch.
Altogether, that makes only three buffer_defaults_objects.j2 files.

As for buffer_defaults_t0/t1.j2, they are very simple.
Their main logic is to define the pool sizes and to invoke macros that define the other buffer objects, such as PGs and queues.

So it doesn't look like a challenge to maintain.

Contributor Author

Hi @lguohan, let me know what you think?

Contributor Author

Hi @lguohan, A gentle reminder

@liat-grozovik
Collaborator

@lguohan let me know if you are OK to merge following Vivek's feedback.
Moving forward, I think we need some general files that can be overridden by specific devices. This should be a new feature on master, which I hope to have in the next release.

@liat-grozovik liat-grozovik changed the title from "Avoid attaching lossless buffer profiles for internal ports" to "[Mellanox] Avoid attaching lossless buffer profiles for internal ports" May 22, 2024
@liat-grozovik
Collaborator

@vivekrnv Is this needed for 202311, and should it be considered a bug fix?

@vivekrnv
Contributor Author

> @vivekrnv is this needed for 202311 and should be considered as bug fix?

Hi @liat-grozovik, this is not required for 202311.

@lguohan
Collaborator

lguohan commented May 24, 2024

@liat-grozovik, I am really concerned about duplicating the templates; it makes future maintenance very challenging.

@prsunny
Contributor

prsunny commented Jul 9, 2024

@lguohan to signoff

"pfcwd_sw_enable" : "3,4"
"pfcwd_sw_enable" : "3,4",
{% endif %}
"tc_to_pg_map" : "AZURE",
Contributor

tc_to_pg_map and pfc_to_queue_map are also not required for internal ports. Are they excluded?

Contributor Author

vivekrnv commented Jul 15, 2024

tc_to_pg_map is required for internal ports.

Did you mean pfc_to_pg_map and pfc_to_queue_map? If so, yes, these two are not required, but having them wouldn't cause any issues IMO. We are configuring the maps without enabling PFC on any of the PGs. Let me know if you think I should remove these two as well.

Collaborator

I agree with @vivekrnv. It's harmless to keep these mappings the same for internal ports and normal ports, and the logic will be much simpler.

Contributor

@vivekrnv @stephenxs lossless buffer profiles are not attached to internal ports, so lossless functionality is not possible on these internal ports.

If it is to keep the config uniform across all the ports, this is good.

Contributor Author

Yes, the idea is to keep the maps uniform across all the ports
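
The outcome of this thread can be sketched as follows: every port gets the same set of maps, and only the PFC enablement differs. This is an illustration of the idea rather than the Jinja2 template from this PR. `port_qos_entry` is a hypothetical helper, and `pfc_enable` is assumed to be the field that enables PFC on front-panel ports; it does not appear in the excerpts quoted in this conversation, whereas `pfcwd_sw_enable` and the map names do.

```python
# Maps kept identical on every port, as in the PORT_QOS_MAP excerpt above.
COMMON_MAPS = {
    "dscp_to_tc_map": "AZURE",
    "pfc_to_pg_map": "AZURE",
    "pfc_to_queue_map": "AZURE",
    "tc_to_pg_map": "AZURE",
    "tc_to_queue_map": "AZURE",
}


def port_qos_entry(is_internal: bool, lossless_priorities: str = "3,4") -> dict:
    """Build one PORT_QOS_MAP entry; internal ports skip PFC enablement."""
    entry = dict(COMMON_MAPS)
    if not is_internal:
        # Assumed field name for enabling PFC (not shown in this PR page).
        entry["pfc_enable"] = lossless_priorities
        entry["pfcwd_sw_enable"] = lossless_priorities
    return entry


if __name__ == "__main__":
    print(port_qos_entry(is_internal=True))
    print(port_qos_entry(is_internal=False))
```

Keeping the map names unconditional means the only branch is on PFC enablement, which is the simplification the reviewers agreed on.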

Contributor Author

Hi @kperumalbfn, Can we sign this off?

@dgsudharsan dgsudharsan requested a review from kperumalbfn July 30, 2024 22:43
@kperumalbfn
Contributor

LGTM

@kperumalbfn kperumalbfn merged commit 4e78b11 into sonic-net:master Jul 31, 2024
19 checks passed
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Aug 2, 2024
[Mellanox] Avoid attaching lossless buffer profiles for internal ports (sonic-net#18978)

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
@mssonicbld
Collaborator

Cherry-pick PR to 202405: #19768

mssonicbld pushed a commit that referenced this pull request Aug 2, 2024
[Mellanox] Avoid attaching lossless buffer profiles for internal ports (#18978)

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>