Fine Grained ECMP doesn't distribute flows according to configured banks #5648

Closed
nazariig opened this issue Oct 16, 2020 · 4 comments

Comments

@nazariig
Collaborator

nazariig commented Oct 16, 2020

Description

Fine Grained ECMP doesn't distribute flows according to configured banks.
The behavior differs from what is described in the HLD.

Steps to reproduce the issue:

  1. Deploy the PTF32 topology
  2. Set IP addresses on the PTF container interfaces
root@c68235932554:~# ifconfig eth0 10.0.0.1 netmask 255.255.255.254
root@c68235932554:~# ifconfig eth1 10.0.0.3 netmask 255.255.255.254
root@c68235932554:~# ifconfig eth2 10.0.0.5 netmask 255.255.255.254
root@c68235932554:~# ifconfig eth3 10.0.0.7 netmask 255.255.255.254
root@c68235932554:~# ifconfig eth4 10.0.0.9 netmask 255.255.255.254
root@c68235932554:~# ifconfig eth5 10.0.0.11 netmask 255.255.255.254
  3. Configure ECMP
root@sonic:/home/admin# vtysh -c 'configure terminal' -c 'ip route 10.10.10.10/32 10.0.0.1'
root@sonic:/home/admin# vtysh -c 'configure terminal' -c 'ip route 10.10.10.10/32 10.0.0.3'
root@sonic:/home/admin# vtysh -c 'configure terminal' -c 'ip route 10.10.10.10/32 10.0.0.5'
root@sonic:/home/admin# vtysh -c 'configure terminal' -c 'ip route 10.10.10.10/32 10.0.0.7'
root@sonic:/home/admin# vtysh -c 'configure terminal' -c 'ip route 10.10.10.10/32 10.0.0.9'
root@sonic:/home/admin# vtysh -c 'configure terminal' -c 'ip route 10.10.10.10/32 10.0.0.11'
  4. Configure Fine Grained ECMP
root@sonic:/home/admin# cat fg_ecmp.json
{
    "FG_NHG": {
        "2-VM-Sets": {
            "bucket_size": "12"
        }
    },
    "FG_NHG_PREFIX": {
        "10.10.10.10/32": {
            "FG_NHG": "2-VM-Sets"
        }
    },
    "FG_NHG_MEMBER": {
        "10.0.0.1": {
                "FG_NHG": "2-VM-Sets",
                "Bank": "0"
        },
        "10.0.0.3": {
                "FG_NHG": "2-VM-Sets",
                "Bank": "0"
        },
        "10.0.0.5": {
                "FG_NHG": "2-VM-Sets",
                "Bank": "1"
        },
        "10.0.0.7": {
                "FG_NHG": "2-VM-Sets",
                "Bank": "1"
        },
        "10.0.0.9": {
                "FG_NHG": "2-VM-Sets",
                "Bank": "2"
        },
        "10.0.0.11": {
                "FG_NHG": "2-VM-Sets",
                "Bank": "2"
        }
    }
}
root@sonic:/home/admin# sonic-cfggen -j fg_ecmp.json -w
  5. Send traffic
>>> sendp(Ether(dst="b8:59:9f:a6:28:00")/IP(dst="10.10.10.10",src="192.168.0.1")/UDP(dport=1,sport=1)/Raw(load="abc"),iface='eth6',count=100)
....................................................................................................
Sent 100 packets.
>>> sendp(Ether(dst="b8:59:9f:a6:28:00")/IP(dst="10.10.10.10",src="192.168.0.2")/UDP(dport=1,sport=1)/Raw(load="abc"),iface='eth6',count=100)
....................................................................................................
Sent 100 packets.
>>> sendp(Ether(dst="b8:59:9f:a6:28:00")/IP(dst="10.10.10.10",src="192.168.0.2")/UDP(dport=1,sport=1)/Raw(load="abc"),iface='eth6',count=100)
....................................................................................................
Sent 100 packets.
  6. Disable port Ethernet16
root@sonic:/home/admin# config interface shutdown Ethernet16
  7. Observe the invalid flow distribution
root@sonic:/home/admin# show interfaces counters
Last cached time was 2020-10-16 17:12:09.220384
      IFACE    STATE    RX_OK      RX_BPS    RX_UTIL    RX_ERR    RX_DRP    RX_OVR    TX_OK      TX_BPS    TX_UTIL    TX_ERR    TX_DRP    TX_OVR
-----------  -------  -------  ----------  ---------  --------  --------  --------  -------  ----------  ---------  --------  --------  --------
  Ethernet0        U        0    0.00 B/s      0.00%         0         0         0      100  640.70 B/s      0.00%         0         0         0
  Ethernet4        U        0    0.00 B/s      0.00%         0         0         0        0    0.00 B/s      0.00%         0         0         0
  Ethernet8        U        0    0.00 B/s      0.00%         0         0         0        0    0.00 B/s      0.00%         0         0         0
 Ethernet12        U        0    0.00 B/s      0.00%         0         0         0        0    0.00 B/s      0.00%         0         0         0
 Ethernet16        U        0    0.00 B/s      0.00%         0         0         0        1   22.52 B/s      0.00%         0         0         0
 Ethernet20        U        0    0.00 B/s      0.00%         0         0         0        1   22.52 B/s      0.00%         0         0         0
 Ethernet24        U      100  640.70 B/s      0.00%         0         0         0        0    0.00 B/s      0.00%         0         0         0

root@sonic:/home/admin# show interfaces counters
Last cached time was 2020-10-16 17:12:09.220384
      IFACE    STATE    RX_OK      RX_BPS    RX_UTIL    RX_ERR    RX_DRP    RX_OVR    TX_OK      TX_BPS    TX_UTIL    TX_ERR    TX_DRP    TX_OVR
-----------  -------  -------  ----------  ---------  --------  --------  --------  -------  ----------  ---------  --------  --------  --------
  Ethernet0        U        0    0.00 B/s      0.00%         0         0         0      100  261.93 B/s      0.00%         0         0         0
  Ethernet4        U        0    0.00 B/s      0.00%         0         0         0        0    0.00 B/s      0.00%         0         0         0
  Ethernet8        U        0    0.00 B/s      0.00%         0         0         0        0    0.00 B/s      0.00%         0         0         0
 Ethernet12        U        0    0.00 B/s      0.00%         0         0         0        0    0.00 B/s      0.00%         0         0         0
 Ethernet16        U        0    0.00 B/s      0.00%         0         0         0      101  271.14 B/s      0.00%         0         0         0
 Ethernet20        U        0    0.00 B/s      0.00%         0         0         0        1    9.21 B/s      0.00%         0         0         0
 Ethernet24        U      200  523.86 B/s      0.00%         0         0         0        0    0.00 B/s      0.00%         0         0         0

root@sonic:/home/admin# show interfaces counters
Last cached time was 2020-10-16 17:12:09.220384
      IFACE    STATE    RX_OK      RX_BPS    RX_UTIL    RX_ERR    RX_DRP    RX_OVR    TX_OK      TX_BPS    TX_UTIL    TX_ERR    TX_DRP    TX_OVR
-----------  -------  -------  ----------  ---------  --------  --------  --------  -------  ----------  ---------  --------  --------  --------
  Ethernet0        U        0    0.00 B/s      0.00%         0         0         0      101  148.63 B/s      0.00%         0         0         0
  Ethernet4        U        0    0.00 B/s      0.00%         0         0         0        1    5.05 B/s      0.00%         0         0         0
  Ethernet8        U        0    0.00 B/s      0.00%         0         0         0        1    5.05 B/s      0.00%         0         0         0
 Ethernet12        U        0    0.00 B/s      0.00%         0         0         0      101  148.63 B/s      0.00%         0         0         0
 Ethernet16        X        0    0.00 B/s      0.00%         0         0         0      102  153.68 B/s      0.00%         0         0         0
 Ethernet20        U        0    0.00 B/s      0.00%         0         0         0        2   10.10 B/s      0.00%         0         0         0
 Ethernet24        U      300  430.75 B/s      0.00%         0         0         0        1    5.05 B/s      0.00%         0         0         0

Note: With Ethernet16 disabled, the flow is expected at egress port Ethernet20, the other member of bank 2 (10.0.0.9@Ethernet16 and 10.0.0.11@Ethernet20).

Describe the results you received:
According to the HLD (https://github.com/Azure/SONiC/blob/master/doc/ecmp/fine_grained_next_hop_hld.md):

Config DB:

FG_NHG|{{fg-nhg-group-name}}:
    "bucket_size": {{hash_bucket_size}}

FG_NHG_PREFIX|{{IPv4 OR IPv6 prefix}}:
    "FG_NHG":{{fg-nhg-group-name}}

FG_NHG_MEMBER|{{next-hop-ip(IPv4 or IPv6 address)}}:
    "FG_NHG":{{fg-nhg-group-name}}
    "Bank": {{an index which specifies a bank/group in which the redistribution is performed}} 

; Defines schema for FG NHG configuration attributes
key                                   = FG_NHG|fg-nhg-group-name      ; FG_NHG group name
; field                               = value
BUCKET_SIZE                           = hash_bucket_size              ; total hash bucket size desired, recommended value of Lowest Common Multiple of 1..{max # of next-hops}

; Defines schema for FG NHG prefix configuration attributes
key                                   = FG_NHG_PREFIX|{{IPv4 OR IPv6 prefix}} ; FG_NHG_PREFIX for which FG behavior is desired
; field                               = value
FG_NHG                                = fg-nhg-group-name                     ; Fine Grained next-hop group name

; Defines schema for FG NHG member configuration attributes
key                                   = FG_NHG_MEMBER|{{next-hop-ip(IPv4 or IPv6 address)}}    ; FG_NHG next-hop-ip member associated with prefix
; field                               = value
FG_NHG                                = fg-nhg-group-name                                      ; Fine Grained next-hop group name
BANK                                  = DIGITS                                                 ; An index which specifies a bank/group in which the redistribution is performed		  

State DB:

FG_ROUTE_TABLE|{{IPv4 OR IPv6 prefix}}:
    "0": {{next-hop-key}}
    "1": {{next-hop-key}}
    ...
    "{{hash_bucket_size -1}}": {{next-hop-key}}

; Defines schema for FG ROUTE TABLE state db attributes
key                                   = FG_ROUTE_TABLE|{{IPv4 OR IPv6 prefix}}      ; Prefix associated with this route
; field                               = value
INDEX                                 = next-hop-key                                ; index in hash bucket associated with the next-hop-key(IP addr,if alias) 

In practice, the orchagent log shows:

Oct 16 16:19:09.111551 sonic INFO swss#orchagent: :- doTaskFgNhg: Added new FG_NHG entry with configured_bucket_size 12
Oct 16 16:19:09.124397 sonic INFO swss#orchagent: :- calculateBankHashBucketStartIndices: Calculate_bank_hash_bucket_start_indices: bank 0, si 0, ei 127

and

root@sonic:/home/admin# redis-cli -n 4 KEYS '*' | grep FG
FG_NHG|2-VM-Sets
FG_NHG_MEMBER|10.0.0.1
FG_NHG_PREFIX|10.10.10.10/32
FG_NHG_MEMBER|10.0.0.9
FG_NHG_MEMBER|10.0.0.11
FG_NHG_MEMBER|10.0.0.3
FG_NHG_MEMBER|10.0.0.5
FG_NHG_MEMBER|10.0.0.7

root@sonic:/home/admin# redis-cli -n 4 HGETALL 'FG_NHG|2-VM-Sets'
1) "bucket_size"
2) "12"
root@sonic:/home/admin# redis-cli -n 4 HGETALL 'FG_NHG_PREFIX|10.10.10.10/32'
1) "FG_NHG"
2) "2-VM-Sets"
root@sonic:/home/admin# redis-cli -n 4 HGETALL 'FG_NHG_MEMBER|10.0.0.1'
1) "Bank"
2) "0"
3) "FG_NHG"
4) "2-VM-Sets"
root@sonic:/home/admin# redis-cli -n 4 HGETALL 'FG_NHG_MEMBER|10.0.0.3'
1) "Bank"
2) "0"
3) "FG_NHG"
4) "2-VM-Sets"
root@sonic:/home/admin# redis-cli -n 4 HGETALL 'FG_NHG_MEMBER|10.0.0.5'
1) "Bank"
2) "1"
3) "FG_NHG"
4) "2-VM-Sets"
root@sonic:/home/admin# redis-cli -n 4 HGETALL 'FG_NHG_MEMBER|10.0.0.7'
1) "Bank"
2) "1"
3) "FG_NHG"
4) "2-VM-Sets"
root@sonic:/home/admin# redis-cli -n 4 HGETALL 'FG_NHG_MEMBER|10.0.0.9'
1) "Bank"
2) "2"
3) "FG_NHG"
4) "2-VM-Sets"
root@sonic:/home/admin# redis-cli -n 4 HGETALL 'FG_NHG_MEMBER|10.0.0.11'
1) "Bank"
2) "2"
3) "FG_NHG"
4) "2-VM-Sets"

root@sonic:/home/admin# redis-cli -n 6 HGETALL 'FG_ROUTE_TABLE|10.10.10.10/32'
  1) "0"
  2) "10.0.0.1@Ethernet0"
  3) "1"
  4) "10.0.0.3@Ethernet4"
  5) "2"
  6) "10.0.0.5@Ethernet8"
  7) "3"
  8) "10.0.0.7@Ethernet12"
  9) "4"
 10) "10.0.0.9@Ethernet16"
...
240) "10.0.0.11@Ethernet20"
241) "120"
242) "10.0.0.1@Ethernet0"
243) "121"
244) "10.0.0.3@Ethernet4"
245) "122"
246) "10.0.0.5@Ethernet8"
247) "123"
248) "10.0.0.7@Ethernet12"
249) "124"
250) "10.0.0.9@Ethernet16"
251) "125"
252) "10.0.0.11@Ethernet20"
253) "126"
254) "10.0.0.1@Ethernet0"
255) "127"
256) "10.0.0.3@Ethernet4"
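
The dump above alternates bucket index and next-hop key. A minimal Python sketch for summarizing such a dump, assuming the `HGETALL` output has already been read into a flat list of strings; the helper name `buckets_per_next_hop` is hypothetical and the sample uses only four of the pairs shown above:

```python
from collections import Counter

def buckets_per_next_hop(flat):
    """Count how many hash buckets point at each next-hop key.

    `flat` is the HGETALL reply as a flat list: [index, next_hop, index, next_hop, ...].
    """
    table = dict(zip(flat[0::2], flat[1::2]))  # bucket index -> next-hop key
    return Counter(table.values())

# Four of the (index, next-hop) pairs visible in the dump above.
sample = [
    "0", "10.0.0.1@Ethernet0",
    "1", "10.0.0.3@Ethernet4",
    "126", "10.0.0.1@Ethernet0",
    "127", "10.0.0.3@Ethernet4",
]
counts = buckets_per_next_hop(sample)
for next_hop, n in sorted(counts.items()):
    print(next_hop, n)
```

Run against the full table, this makes it easy to see whether each next hop owns a comparable share of the buckets.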

Describe the results you expected:
Banks and their bucket ranges should be calculated according to the configured values.
Flow distribution should happen only within the configured banks.
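
The expectation can be written as a checkable property: when a next hop goes down, every bucket it owned should move to a next hop in the same bank. A minimal sketch, assuming bucket tables are available as plain dicts (bucket index to next-hop IP); `stays_within_bank` is a hypothetical helper, not part of SONiC, and the bank map mirrors fg_ecmp.json from the steps above:

```python
def stays_within_bank(before, after, bank_of):
    """Return True if no bucket changed to a next hop in a different bank."""
    for bucket, old_nh in before.items():
        new_nh = after[bucket]
        if bank_of[new_nh] != bank_of[old_nh]:
            return False
    return True

# Bank membership as configured in fg_ecmp.json.
bank_of = {
    "10.0.0.1": 0, "10.0.0.3": 0,
    "10.0.0.5": 1, "10.0.0.7": 1,
    "10.0.0.9": 2, "10.0.0.11": 2,
}

before = {4: "10.0.0.9", 5: "10.0.0.11"}
# 10.0.0.9 (Ethernet16) goes down; its bucket moves to the other bank-2 member.
good_after = {4: "10.0.0.11", 5: "10.0.0.11"}
# The bucket leaks into bank 0 instead, which violates the expectation.
bad_after = {4: "10.0.0.1", 5: "10.0.0.11"}

print(stays_within_bank(before, good_after, bank_of))
print(stays_within_bank(before, bad_after, bank_of))
```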

Additional information you deem important (e.g. issue happens only occasionally):
System log: syslog.txt
Redis dump: config.txt
Automation test passes: sonic-net/sonic-mgmt#1788

Output of show version:

(paste your output here)

Attach debug file sudo generate_dump:

(paste your output here)
@anish-n
Contributor

anish-n commented Oct 19, 2020

@nazariig can you try with "bank" instead of "Bank" (note the lowercase "b")? I will fix the schema/example in the documentation.
After changing to "bank", if you still observe issues, can you share the following:

  1. What was Ethernet16 configured as (VLAN member, regular L3 interface, etc.)?
  2. Do you observe the invalid flow distribution without disabling Ethernet16?
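
The rename can be applied to an existing config file mechanically. A minimal Python sketch, assuming the field name is the only problem in the file; `normalize_bank_key` is a hypothetical helper and the example dict is a two-member excerpt of the fg_ecmp.json from the report:

```python
import json

def normalize_bank_key(config):
    """Return a copy of `config` with "Bank" renamed to "bank" in FG_NHG_MEMBER entries."""
    fixed = json.loads(json.dumps(config))  # deep copy via a JSON round trip
    for member in fixed.get("FG_NHG_MEMBER", {}).values():
        if "Bank" in member:
            member["bank"] = member.pop("Bank")
    return fixed

example = {
    "FG_NHG_MEMBER": {
        "10.0.0.1": {"FG_NHG": "2-VM-Sets", "Bank": "0"},
        "10.0.0.5": {"FG_NHG": "2-VM-Sets", "Bank": "1"},
    }
}
fixed = normalize_bank_key(example)
print(json.dumps(fixed, indent=4))
```

The input dict is left unmodified, so the original and corrected configs can be compared before loading the corrected one.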

@nazariig
Collaborator Author

@anish-n Sure, I will check and update. Also, could you please elaborate on the following?
According to the HLD we have:

FG_ROUTE_TABLE|{{IPv4 OR IPv6 prefix}}:
    "0": {{next-hop-key}}
    "1": {{next-hop-key}}
    ...
    "{{hash_bucket_size -1}}": {{next-hop-key}}

In the example above we have:

root@sonic:/home/admin# redis-cli -n 6 HGETALL 'FG_ROUTE_TABLE|10.10.10.10/32'
  1) "0"
  2) "10.0.0.1@Ethernet0"
  3) "1"
  4) "10.0.0.3@Ethernet4"
  5) "2"
  6) "10.0.0.5@Ethernet8"
  7) "3"
  8) "10.0.0.7@Ethernet12"
  9) "4"
 10) "10.0.0.9@Ethernet16"
...
240) "10.0.0.11@Ethernet20"
241) "120"
242) "10.0.0.1@Ethernet0"
243) "121"
244) "10.0.0.3@Ethernet4"
245) "122"
246) "10.0.0.5@Ethernet8"
247) "123"
248) "10.0.0.7@Ethernet12"
249) "124"
250) "10.0.0.9@Ethernet16"
251) "125"
252) "10.0.0.11@Ethernet20"
253) "126"
254) "10.0.0.1@Ethernet0"
255) "127"
256) "10.0.0.3@Ethernet4"

This is quite different: I would expect to see indices 0, 1, 2, ..., 11 rather than 0, 1, 2, ..., 127. Is this expected?

@anish-n
Contributor

anish-n commented Oct 19, 2020

@nazariig 0...127 is expected, since the real hash bucket size in hardware is 128 (vs. the 12 requested by the user in the config_db entry), so fg_ecmp orch populates next-hop entries for all 128 buckets rather than 12.
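
One reading of the `bank 0, si 0, ei 127` log line in the report, offered as an assumption rather than a confirmed trace of orchagent: with the `Bank` field not recognized, all six members are treated as a single bank that owns every hardware bucket. The sketch below is not the orchagent code; it assumes buckets are split among banks in proportion to member count, and `bank_ranges` is a hypothetical helper.

```python
def bank_ranges(real_bucket_size, members_per_bank):
    """Split bucket indices 0..real_bucket_size-1 into contiguous per-bank ranges.

    Assumption: each bank gets a share proportional to its member count.
    Returns a list of (start_index, end_index) tuples, one per bank.
    """
    total = sum(members_per_bank)
    ranges = []
    start = 0
    members_so_far = 0
    for count in members_per_bank:
        members_so_far += count
        # Cumulative rounding, so the last bank always ends at real_bucket_size - 1.
        end = real_bucket_size * members_so_far // total - 1
        ranges.append((start, end))
        start = end + 1
    return ranges

single_bank = bank_ranges(128, [6])        # all six members in one bank
three_banks = bank_ranges(128, [2, 2, 2])  # two members in each of banks 0, 1, 2
print(single_bank)   # [(0, 127)]
print(three_banks)   # [(0, 41), (42, 84), (85, 127)]
```

Under this assumption, the single-bank case reproduces the 0 to 127 range from the log, while a correctly parsed three-bank config would yield three disjoint ranges.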

@nazariig
Copy link
Collaborator Author

@anish-n I have retested with the change you suggested and observed no issues. Please update the HLD with a working example.
