[crmorch] orchagent crash when destructing crmorch #1991
We are able to reproduce the issue, and a fix is in progress.
I decided to update the existing bug instead of raising a new issue. I am attaching my config.json file and the crash dump: config-reload-orchagent-crash.txt. Please let me know if more information is needed.
Description
I hit a core dump while rebooting the system.
I think the cause is that m_timer (class CrmOrch) is held by a smart pointer.
The SelectableTimer object that m_timer points to is also referenced by ExecutableTimer through a raw pointer (Executor::m_selectable).
When CrmOrch is destroyed, m_timer is destroyed first, which releases the SelectableTimer object's memory. The base class member m_consumerMap is destroyed afterwards, so ~Executor() tries to delete the same SelectableTimer object a second time.
I am not sure how the erroneous config causes execution to leave the while loop. Presumably an exception occurred.
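The destruction order described above can be illustrated with a minimal sketch. All class names here (Timer, Executor, OrchBase, CrmLikeOrch) and the function destruction_order() are hypothetical stand-ins, not the real sonic-swss classes, and the sketch shows only one possible way to avoid the double delete (keeping the derived-class member as a non-owning raw pointer); it is not necessarily the fix the maintainers chose. It relies on the C++ rule that a derived class's members are destroyed before the base class's members.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; NOT the real sonic-swss classes.
static std::vector<std::string> g_log;  // records the order of destructor calls

struct Timer {
    ~Timer() { g_log.push_back("~Timer"); }
};

// Assumption taken from the report: the executor deletes the selectable
// it was given, so it is the real owner of the timer.
struct Executor {
    explicit Executor(Timer* t) : m_selectable(t) {}
    ~Executor() {
        g_log.push_back("~Executor");
        delete m_selectable;
    }
    Executor(const Executor&) = delete;
    Executor& operator=(const Executor&) = delete;
    Timer* m_selectable;
};

struct OrchBase {
    ~OrchBase() { g_log.push_back("~OrchBase"); }
    // Plays the role of m_consumerMap: destroyed after all derived members.
    std::vector<std::unique_ptr<Executor>> m_consumers;
};

struct CrmLikeOrch : OrchBase {
    CrmLikeOrch() : m_timer(new Timer()) {
        m_consumers.push_back(std::make_unique<Executor>(m_timer));
    }
    ~CrmLikeOrch() { g_log.push_back("~CrmLikeOrch"); }
    // Non-owning observer. If this were an owning smart pointer, the Timer
    // would be freed here first and ~Executor() would delete it again.
    Timer* m_timer;
};

// Builds and destroys one orch object, returning the destructor call order.
std::vector<std::string> destruction_order() {
    g_log.clear();
    { CrmLikeOrch orch; }
    return g_log;
}
```

Calling destruction_order() shows that the derived destructor and the base destructor body both run before the base member holding the executors is torn down, so the timer is released exactly once, by its owner.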
Steps to reproduce the issue:
"LOOPBACK_INTERFACE": {
"Loopback2|101.101.101.101/32": {},
"Loopback2|101.101.1.1/32": {},
"Loopback2|101.101.1.2/32": {},
"Loopback3|101.101.1.3/32": {},
...
},
Describe the results you received:
Describe the results you expected:
Additional information you deem important (e.g. issue happens only occasionally):