A buffer pool object can be removed before its counter is removed even if orchagent removes the counter first.
This defect can occur on any object that has a counter attached, because orchagent notifies sairedis to remove the object and its counter via different channels: the object removal goes through the ASIC_DB channel, whereas the counter removal goes through FLEX_DB. There is no mechanism to preserve the order of the two requests between orchagent (OA) and sairedis.
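The ordering problem can be illustrated with a small, self-contained model. This is only a sketch and not SONiC code: the two `deque` objects stand in for the two channels, and `simulate` and the event names are hypothetical. It assumes each channel is FIFO on its own and that nothing constrains which channel the consumer services first.

```python
from collections import deque


def simulate(consumer_order):
    """Return the order in which the consumer handles the two removals.

    The producer always issues the counter removal before the object
    removal, but on two separate channels (hypothetical stand-ins).
    """
    flex_db = deque()  # stands in for the counter channel
    asic_db = deque()  # stands in for the object channel

    # Producer side: counter removal is issued first, object removal second.
    flex_db.append(("remove_counter", "pool-1"))
    asic_db.append(("remove_object", "pool-1"))

    channels = {"FLEX_DB": flex_db, "ASIC_DB": asic_db}
    processed = []
    # Consumer side: each channel is drained in FIFO order, but the order
    # in which the channels themselves are serviced is not tied to the
    # order in which the producer wrote to them.
    for name in consumer_order:
        while channels[name]:
            event, _key = channels[name].popleft()
            processed.append(event)
    return processed


in_order = simulate(["FLEX_DB", "ASIC_DB"])
raced = simulate(["ASIC_DB", "FLEX_DB"])
print(in_order)
print(raced)
```

In the second call the object removal is handled before the counter removal, even though the producer issued them in the opposite order, which is the situation this issue describes.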
It's a rare case for the buffer pool object. We observed it once when the zero buffer pool (used for reclaiming buffer) was removed, with all ports started up, just after a system warm reboot.
The zero buffer pool was removed:
But in the log we see that the counter was still accessed and removed after the buffer pool had been removed:
Dec 16 06:32:30.469317 r-spider-05 INFO syncd#SDK: :- processSingleEvent: key: SAI_OBJECT_TYPE_BUFFER_POOL:oid:0x18000000000a3d op: remove
Dec 16 06:32:30.469317 r-spider-05 NOTICE syncd#SDK: [SAI_BUFFER.NOTICE] ./src/mlnx_sai_buffer.c[2221]- mlnx_sai_remove_buffer_pool: Remove BUFFER_POOL [OID:0x400000018] [sx_cos_pool_id:4]
Dec 16 06:32:30.470509 r-spider-05 INFO syncd#SDK: :- sendApiResponse: sending response for SAI_COMMON_API_REMOVE api with status: SAI_STATUS_SUCCESS
Dec 16 06:32:30.722009 r-spider-05 INFO syncd#SDK: :- tryTranslateVidToRid: unable to get RID for VID oid:0x18000000000a3d
Dec 16 06:32:30.722061 r-spider-05 WARNING syncd#SDK: :- processFlexCounterEvent: port VID oid:0x18000000000a3d, was not found (probably port was removed/splitted) and will remove from counters now
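To look for the same pattern in other syncd logs, a scan along the following lines might help. It is a rough sketch: the regular expressions are derived only from the log lines quoted above, `late_references` is a hypothetical helper name, and other platforms or versions may format these messages differently.

```python
import re

# Matches the "op: remove" event for an object and captures its VID.
REMOVE_RE = re.compile(
    r"processSingleEvent: key: \S+?:(oid:0x[0-9a-fA-F]+) op: remove"
)
# Matches any lowercase "oid:0x..." VID mentioned in a line.
VID_RE = re.compile(r"oid:0x[0-9a-fA-F]+")


def late_references(lines):
    """Return lines that mention a VID after that VID was removed."""
    removed = set()
    hits = []
    for line in lines:
        match = REMOVE_RE.search(line)
        if match:
            removed.add(match.group(1))
            continue
        if any(vid in removed for vid in VID_RE.findall(line)):
            hits.append(line)
    return hits


# Sample input abbreviated from the log excerpt in this issue.
sample = [
    "processSingleEvent: key: SAI_OBJECT_TYPE_BUFFER_POOL:oid:0x18000000000a3d op: remove",
    "sendApiResponse: sending response for SAI_COMMON_API_REMOVE api with status: SAI_STATUS_SUCCESS",
    "tryTranslateVidToRid: unable to get RID for VID oid:0x18000000000a3d",
    "processFlexCounterEvent: port VID oid:0x18000000000a3d, was not found",
]
hits = late_references(sample)
for line in hits:
    print(line)
```

On the sample, the two lines that reference the VID after its removal are reported.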
stephenxs changed the title from "[Buffer pool] A buffer pool object can be removed before its counter is removed even if orchagent removes the counter first" to "[Flex counter] A buffer pool object can be removed before its counter is removed even if orchagent removes the counter first" on Dec 27, 2023
lguohan transferred this issue from sonic-net/sonic-buildimage on Jan 3, 2024
Description
This issue is very similar to sonic-net/sonic-buildimage#14628 which is for the RIF object.
Steps to reproduce the issue:
According to the code, orchagent removes the counter first: in clearBufferPoolWatermarkCounterIdList it removes the entry in FLEX_COUNTER_DB. However, the log shows that the counter was still accessed and removed after the buffer pool had been removed.