Add write socket buffer to fix CoPP rx performance issue #1092
- What I did
Add write socket buffer to fix packet drop issue in ptf_nn_agent.py
- How I did it
Add set command in the file build_debian.sh
- How to verify it
Run the CoPP tests in the testbed and verify that the tests pass
- Description for the changelog
Add write socket buffer to fix CoPP rx performance issue
More detailed discussion:
Hi Jason, thank you for your PR. I'd like to understand the reason for increasing the write buffer. I increased the read buffer previously because ingress packets arrive at the ASIC line rate, which a Python reader cannot keep up with. So I increased the socket memory buffer to ask the Linux kernel to hold unread packets for ptf_nn_agent. In your case, by contrast, increasing the write buffer implies that your application writes packets at a rate higher than your NIC/ASIC can handle. But here the writer is Python and the consumer is the Linux kernel with the ASIC, and as I understand it Python is much slower than the Linux kernel plus ASIC. Can you please explain how your change helps? I think increasing the read buffer is enough. Thanks
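To make the read-buffer argument concrete, here is a toy model of a bounded buffer that is filled faster than it is drained. It is only a sketch: the function name `simulate_drops` and all of the rates and sizes are made up for illustration and are not measurements from ptf_nn_agent or any testbed.

```python
def simulate_drops(arrivals_per_tick, reads_per_tick, buffer_size, ticks):
    """Count packets dropped by a bounded queue that is fed faster than it is read.

    All parameters are hypothetical units (packets per tick, packets of capacity).
    """
    queued = 0
    dropped = 0
    for _ in range(ticks):
        # Packets arrive first; whatever does not fit in the buffer is dropped.
        free = buffer_size - queued
        accepted = min(arrivals_per_tick, free)
        dropped += arrivals_per_tick - accepted
        queued += accepted
        # The slow reader then drains as much as it can in this tick.
        queued -= min(queued, reads_per_tick)
    return dropped


if __name__ == "__main__":
    # Same burst, same reader speed, two buffer sizes.
    print(simulate_drops(10, 4, buffer_size=20, ticks=5))
    print(simulate_drops(10, 4, buffer_size=100, ticks=5))
```

In this model a larger buffer absorbs a short burst without loss, but if the overload is sustained the buffer eventually fills and drops resume, whatever its size.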
I don't request any change. I'd like to understand why the changes were offered.
Hi, Pavel: Thanks for your review. In the CoPP tests, ptf_nn_agent sends packets back from the DUT to the test server so that matched packets can be counted, and an insufficient write socket buffer causes ingress packet drops on the DUT. If ptf_nn_agent did not send packets back from the DUT to the test server, I think increasing the read socket buffer on the DUT would be enough. However, this action is required by ptf_nn_agent on the test server. When sending packets back to the test server, I think the bottleneck is between Python and the Linux kernel, not on the ASIC. For example, the CoPP test for DHCP sends 100000 DHCP packets from the test server to the DUT without a CoPP policy applied and expects a low packet loss rate (<10%) on the DUT. Before increasing the write socket buffer, the CoPP test for DHCP failed because the packet loss was higher than expected.
After increasing the socket buffer, the CoPP test for DHCP passed.
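The pass criterion described above can be written out as a small check. This is a sketch only: the helper names `loss_rate` and `copp_dhcp_passes` and the received counts in the usage lines are hypothetical, while the 100000 packets and the 10% threshold come from the comment above.

```python
def loss_rate(sent, received):
    """Fraction of sent packets that were not counted as received."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return (sent - received) / sent


def copp_dhcp_passes(sent, received, max_loss=0.10):
    """True when the loss rate is strictly below the allowed maximum."""
    return loss_rate(sent, received) < max_loss


if __name__ == "__main__":
    # Hypothetical received counts, for illustration only.
    print(copp_dhcp_passes(100000, 95000))
    print(copp_dhcp_passes(100000, 80000))
```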
Below is how we increase the write buffer in the Linux kernel and in Python. To increase the write buffer in the Linux kernel, we add the following line to the file /etc/sysctl.conf on the DUT
To increase the write buffer in Python, we add the options below to the ptf_nn_agent command on the DUT
In our testing, the CoPP test passes only when the write buffers in both the Linux kernel and Python are increased.
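One possible reason both settings are needed, which is an assumption and not something confirmed in this thread, is that on Linux a per-socket `SO_SNDBUF` request is capped by the system-wide `net.core.wmem_max` limit. The sketch below uses only the standard `socket` module to request a send buffer and read back what the kernel granted. It does not reproduce the actual sysctl line or ptf_nn_agent options used in this PR, and the helper names are hypothetical.

```python
import socket


def request_send_buffer(sock, size):
    """Ask for a send buffer of `size` bytes and return the size the kernel reports.

    On Linux the reported value is typically double the accepted request,
    because the kernel reserves space for bookkeeping overhead.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


def read_wmem_max(path="/proc/sys/net/core/wmem_max"):
    """Return the system-wide send buffer cap, or None if it cannot be read.

    The /proc path exists on Linux only; other platforms return None.
    """
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None
```

Calling `request_send_buffer` on a fresh UDP socket with a value above `read_wmem_max()` and comparing the result before and after raising the sysctl limit would show whether the cap is what limits the application-side setting.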
Honestly, I don't understand how the Linux kernel could be slower than Python.
Fix this issue by increasing the read socket buffer instead of the write buffer:
See more discussion: sonic-mgmt#308