Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core dump in orchagent when assigning router interface to a vlan with untagged mode #2684

Open
justin-lu-lyc opened this issue Mar 20, 2019 · 7 comments
Assignees

Comments

@justin-lu-lyc
Copy link

justin-lu-lyc commented Mar 20, 2019

Description
Core dump in orchagent when assigning router interface to a vlan with untagged mode

Steps to reproduce the issue:

  1. Assign IP to Ethernet20 by editing /etc/sonic/config_db.json
  2. create vlan 108
    config vlan add 108
  3. add this router interface to vlan108 with untagged mode
    config vlan member add -u 108 Ethernet20

Describe the results you received:
Add router interface to a vlan with untagged mode is illegal, suppose it should reject this configuration.

Describe the results you expected:
Result : orchagent in swss container core dump
orchagent (terminated by SIGABRT (core dumped); not expected)

Additional information you deem important (e.g. issue happens only occasionally):

**Output of `show version`:**

```
(paste your output here)
```

**Attach debug file `sudo generate_dump`:**

```
(paste your output here)
```

syslog for this issue
accton_load_201811_offical_19_run_untag_vlan_in_router_interface.txt

@justin-lu-lyc
Copy link
Author

I thought the reason might be in line 241
sonic NOTICE syncd#syncd: :- exit_and_notify: sending switch_shutdown_request notification to OA

and in line 246, core dump happened
Nov 3 17:28:19.596339 sonic INFO swss.sh[2353]: 2016-11-03 17:28:19,595 INFO exited: orchagent (terminated by SIGABRT (core dumped); not expected)

@xinliu-seattle
Copy link
Contributor

Add router interface to a vlan - this is not supported operation. One team is working on the configuration validation. Once it is done, this will behave differently.

@justin-lu-lyc
Copy link
Author

Agree. We need to do more configuration validation in sonic-utility.
But I think the orchagent should not crash for a little wrong configuration, maybe it just print some error log and skip that setting?

@xinliu-seattle
Copy link
Contributor

Right, this behavior of crash is going to be changed once the config validation is done.

@jipanyang
Copy link
Collaborator

For now, the mitigation to this problem: sonic-net/sonic-swss#770

@justin-lu-lyc
Copy link
Author

Right, this behavior of crash is going to be changed once the config validation is done.

Will this feature be release in the end of March? I noticed that the configuration validation is listed in Roadmap SONiC.201903.

@justin-lu-lyc
Copy link
Author

For now, the mitigation to this problem: Azure/sonic-swss#770

Thanks for your reply. I use similar way to prevent orchagent from crash, but it leads to the mismatch between CONFIG_DB and ASIC_DB. So maybe the best way to resolve this issue is still configuration validation.

yxieca added a commit to yxieca/sonic-buildimage that referenced this issue Mar 1, 2023
…ance submodule head

utilities:
* a4f141f1 2023-01-10 | [sfputil] Firmware download/upgrade CLI support for QSFP-DD (sonic-net#1947) (sonic-net#2349) (HEAD -> 202205) [CliveNi]

swss-common:
* 41fcad8 2023-01-30 | Increase the netlink buffer size from 3MB to 16MB. (sonic-net#739) (HEAD -> 202205) [KISHORE KUNAL]

sairedis:
* 5ce9990 2023-02-27 | [Dual-ToR] update sai.profile with SAI_ADDITIONAL_MAC_ENABLED attribute if corresponding arg passed to syncd (sonic-net#1201) (HEAD -> 202205) [Andriy Yurkiv]
* 3c2e0c5 2023-02-23 | Use new value of STATE_DB FAST_REBOOT entry (sonic-net#1196) [Aryeh Feigin]
* fe7756f 2023-02-28 | [submodule][SAI]Advance SAI head (sonic-net#1210) (github/202205) [Richard.Yu]

platform-common:
* 321a8e7 2022-09-23 | Cdb fw upgrade (sonic-net#308) (HEAD -> 202205) [CliveNi]

swss:
* ceea558 2023-02-28 | [orchagent]: Get bridge port ID from orchagent cache instead of SAI API (sonic-net#2657) (HEAD -> 202205) [Lawrence Lee]
* bd04e24 2023-03-01 | [dualtor] Fix neighbor miss when mux is not ready (sonic-net#2676) (HEAD -> 202205) [Longxiang Lyu]
* 7d87a90 2023-02-28 | [ci] Fix pipeline error about team5 not found. (sonic-net#2684) [Liu Shilong]
* 93a924c 2023-02-27 | [aclorch] Fixed issue sonic-net#2204.Support IN_PORTS qualifer in MIRRORV6 table. (sonic-net#2668) [Rajkumar-Marvell]
* 9d87ec4 2023-02-23 | swss: Fix Invalid port oid messages generated because of voq counters. (sonic-net#2653) [Sambath Kumar Balasubramanian]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
yxieca added a commit that referenced this issue Mar 1, 2023
…ance submodule head (#14029)

utilities:
* a4f141f1 2023-01-10 | [sfputil] Firmware download/upgrade CLI support for QSFP-DD (#1947) (#2349) (HEAD -> 202205) [CliveNi]

swss-common:
* 41fcad8 2023-01-30 | Increase the netlink buffer size from 3MB to 16MB. (#739) (HEAD -> 202205) [KISHORE KUNAL]

sairedis:
* 5ce9990 2023-02-27 | [Dual-ToR] update sai.profile with SAI_ADDITIONAL_MAC_ENABLED attribute if corresponding arg passed to syncd (#1201) (HEAD -> 202205) [Andriy Yurkiv]
* 3c2e0c5 2023-02-23 | Use new value of STATE_DB FAST_REBOOT entry (#1196) [Aryeh Feigin]
* fe7756f 2023-02-28 | [submodule][SAI]Advance SAI head (#1210) (github/202205) [Richard.Yu]

platform-common:
* 321a8e7 2022-09-23 | Cdb fw upgrade (#308) (HEAD -> 202205) [CliveNi]

swss:
* ceea558 2023-02-28 | [orchagent]: Get bridge port ID from orchagent cache instead of SAI API (#2657) (HEAD -> 202205) [Lawrence Lee]
* bd04e24 2023-03-01 | [dualtor] Fix neighbor miss when mux is not ready (#2676) (HEAD -> 202205) [Longxiang Lyu]
* 7d87a90 2023-02-28 | [ci] Fix pipeline error about team5 not found. (#2684) [Liu Shilong]
* 93a924c 2023-02-27 | [aclorch] Fixed issue #2204.Support IN_PORTS qualifer in MIRRORV6 table. (#2668) [Rajkumar-Marvell]
* 9d87ec4 2023-02-23 | swss: Fix Invalid port oid messages generated because of voq counters. (#2653) [Sambath Kumar Balasubramanian]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
StormLiangMS added a commit that referenced this issue Mar 5, 2023
Why I did it
submodule advance for master branch

309df59 - Revert "[aclorch] Fixed issue [Mellanox] Update SDK to v4.2.9102 #2204.Support IN_PORTS qualifer in MIRRORV6 table. (Cmd "config vlan member add <vid> <interface_name>" always adds interface as tagged #2668)" (Add warm/fast-boot feature processing for wedge100bf_32x/65x platforms #2687) (85 minutes ago) [StormLiangMS]
ebe8de7 - [FDB]Fixing FDB consolidated flush for Remote MACs (pmon to stretch #2673) (2 days ago) [Sudharsan Dhamal Gopalarathnam]
c9ae6aa - Fix issue: there is no retry while creating a RIF which is in removing state ([201811 sub-module] advance sub-modules: utilities, swss, swss-common #2679) (2 days ago) [Junchao-Mellanox]
79afcb3 - [Dual-ToR] handle 'mux_tunnel_egress_acl' attrib in order to change ACL configuration (drop on ingress/egress) on standby ToR (lm75 doesn't support written alarm to syslog. #2646) (3 days ago) [Andriy Yurkiv]
c2b01ba - [orchagent]: Get bridge port ID from orchagent cache instead of SAI API ([201811 sub module] advance sairedis sub module #2657) (3 days ago) [Lawrence Lee]
d8a1cb7 - [dualtor] Fix neighbor miss when mux is not ready ([mellanox] Fix in mlnx-ffb.sh #2676) (3 days ago) [Longxiang Lyu]
1531dff - [ci] Fix pipeline error about team5 not found. (Core dump in orchagent when assigning router interface to a vlan with untagged mode  #2684) (4 days ago) [Liu Shilong]
cfcd40c - [aclorch] Fixed issue [Mellanox] Update SDK to v4.2.9102 #2204.Support IN_PORTS qualifer in MIRRORV6 table. (Cmd "config vlan member add <vid> <interface_name>" always adds interface as tagged #2668) (4 days ago) [Rajkumar-Marvell]
35a7ab0 - swss: Fix Invalid port oid messages generated because of voq counters. (Failed to update FlexCounter, Segmentation fault #2653) (8 days ago) [Sambath Kumar Balasubramanian]
How I did it
How to verify it
run PR test
StormLiangMS added a commit that referenced this issue Mar 8, 2023
Why I did it
submodule advance

b085b5f - [ci] Fix pipeline error about team5 not found. (Core dump in orchagent when assigning router interface to a vlan with untagged mode  #2684) (3 hours ago) [Liu Shilong]
4549b4c - Fix issue: there is no retry while creating a RIF which is in removing state ([201811 sub-module] advance sub-modules: utilities, swss, swss-common #2679) (3 hours ago) [Junchao-Mellanox]
980a45b - [FDB]Fixing FDB consolidated flush for Remote MACs (pmon to stretch #2673) (3 hours ago) [Sudharsan Dhamal Gopalarathnam]
c646607 - Do not allow to add port to .1Q bridge while router port deletion is not completed (Update SDK, FW and SAI #2669) (3 hours ago) [Lior Avramov]
4a321f0 - [orchagent]: Get bridge port ID from orchagent cache instead of SAI API ([201811 sub module] advance sairedis sub module #2657) (3 hours ago) [Lawrence Lee]
f4b88f3 - [Dual-ToR] handle 'mux_tunnel_egress_acl' attrib in order to change ACL configuration (drop on ingress/egress) on standby ToR (lm75 doesn't support written alarm to syslog. #2646) (3 hours ago) [Andriy Yurkiv]
a4f29c1 - [Workaround] EvpnRemoteVnip2pOrch warmboot check failure ([teamd]: wait for swss db flush done before starting teamd container #2626) (3 hours ago) [jcaiMR]
53ee0a8 - Support for tc-dot1p and tc-dscp qosmap ([201803] [router-advertiser] Add templated script to wait for pertinent interfaces to be ready before starting radvd #2559) (3 hours ago) [Divya Mukundan]
b953866 - [dual-tor] add missing SAI attribte in order to create IPNIP tunnel (Config reload/load_minigraph not clearing State DB #2503) (3 hours ago) [Andriy Yurkiv]
How I did it
How to verify it
xumia pushed a commit to xumia/sonic-buildimage-1 that referenced this issue Mar 10, 2023
Why I did it
submodule advance for master branch

309df59 - Revert "[aclorch] Fixed issue [Mellanox] Update SDK to v4.2.9102 sonic-net#2204.Support IN_PORTS qualifer in MIRRORV6 table. (Cmd "config vlan member add <vid> <interface_name>" always adds interface as tagged sonic-net#2668)" (Add warm/fast-boot feature processing for wedge100bf_32x/65x platforms sonic-net#2687) (85 minutes ago) [StormLiangMS]
ebe8de7 - [FDB]Fixing FDB consolidated flush for Remote MACs (pmon to stretch sonic-net#2673) (2 days ago) [Sudharsan Dhamal Gopalarathnam]
c9ae6aa - Fix issue: there is no retry while creating a RIF which is in removing state ([201811 sub-module] advance sub-modules: utilities, swss, swss-common sonic-net#2679) (2 days ago) [Junchao-Mellanox]
79afcb3 - [Dual-ToR] handle 'mux_tunnel_egress_acl' attrib in order to change ACL configuration (drop on ingress/egress) on standby ToR (lm75 doesn't support written alarm to syslog. sonic-net#2646) (3 days ago) [Andriy Yurkiv]
c2b01ba - [orchagent]: Get bridge port ID from orchagent cache instead of SAI API ([201811 sub module] advance sairedis sub module sonic-net#2657) (3 days ago) [Lawrence Lee]
d8a1cb7 - [dualtor] Fix neighbor miss when mux is not ready ([mellanox] Fix in mlnx-ffb.sh sonic-net#2676) (3 days ago) [Longxiang Lyu]
1531dff - [ci] Fix pipeline error about team5 not found. (Core dump in orchagent when assigning router interface to a vlan with untagged mode  sonic-net#2684) (4 days ago) [Liu Shilong]
cfcd40c - [aclorch] Fixed issue [Mellanox] Update SDK to v4.2.9102 sonic-net#2204.Support IN_PORTS qualifer in MIRRORV6 table. (Cmd "config vlan member add <vid> <interface_name>" always adds interface as tagged sonic-net#2668) (4 days ago) [Rajkumar-Marvell]
35a7ab0 - swss: Fix Invalid port oid messages generated because of voq counters. (Failed to update FlexCounter, Segmentation fault sonic-net#2653) (8 days ago) [Sambath Kumar Balasubramanian]
How I did it
How to verify it
run PR test
mihirpat1 pushed a commit to mihirpat1/sonic-buildimage that referenced this issue Jun 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants