Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix flex counter out-of-order issue by notifying counter operations using SelectableChannel #3076

Merged
merged 7 commits into from
May 15, 2024

Conversation

stephenxs
Copy link
Collaborator

@stephenxs stephenxs commented Mar 13, 2024

What I did

Fix flex counter out-of-order issue by notifying counter operations using SelectableChannel
Depends on sonic-net/sonic-sairedis#1362

Why I did it

Currently, the operations of SAI objects and their counters (if any) are triggered by different channels, which introduces racing conditions:

  • the creation and destruction of the objects are notified using the SelectableChannel,
  • the operations of counters, including starting and stopping polling the counters, are notified by listening to the FLEX_COUNTER and FLEX_COUNTER_GROUP tables in the FLEX_COUNTER_DB
  • The orchagent always respects the order when starting/stopping counter-polling (which means to start counter-polling after creating the object and to stop counter-polling before destroying the object) but syncd can receive events in a wrong order, eg. it receives destroying an object first and then stopping counter polling on the object, it can poll counter for a non-exist object, which causes errors in vendor SAI.

The new solution is to extend SAI redis attributes on the SAI_SWITCH_OBJECT to notify counter polling. As a result, all the objects and their counters are notified using a unified channel, the SelectableChannel.

How I verified it

Manual test
Regression test
Mock test

Details if related

  1. Extend SAI redis attribute on the switch object to identify counter-polling operations.
  2. Introduce a CLI option of orchagent to identify whether the new or the old approach is used for notifying counter-polling. It can not be changed on the fly.
  3. Move the logic to notify counter polling from each orchagent class and flex counter manager class to a commonplace. The logic is:
    • For the new approach, notify counter polling using SAI extended redis attribute
    • For the old approach, notify counter-polling using flex database tables. The corresponding flex database table objects (ProducerTable) are initialized during orchagent initialization, before any flex counter operations.
  4. Remove the definition of and the references to flex counter and flex counter group table in each orchagent class (usually m_flexCounterTable and m_flexCounterGroupTable)
  5. Improve the logic to load the counters' Lua plugin for mock test. The Lua scripts are not available in the mock test scenarios, which causes exceptions to load them in mock test. In case it fails to load the Lua plugin,
    • Originally, it skipped any flex counter group operations.
    • Now, it will leave the SHA code of the Lua plugin empty, and continue to notify the flex counter operations to SAI redis.
      • In the production scenario, where it can hardly happen, the plugin field will not be notified to SAI.
      • In the mock test scenario, it will notify SAI as it is. This is for mock test to verify whether the plugin field is provided as expected.
  6. Gearbox counter handling.
    • Maintain a mapping from each OID created based on the gearbox, from the object ID itself to the gearbox's OID.
    • To notify counter operations for such OIDs in the new approach, we need to first fetch the corresponding gearbox OID from the mapping and the notify the gearbox OID as switch OID.
    • The old approach looks buggy.
      • There can be more than one gearbox syncd docker in the system, which is a P2MP scenario, the OA needs to communicate to multiple gearbox syncd dockers.
      • The old approach leverages ConsumerTable, ProducerTable mechanism to communicate between OA and sairedis. However, it works for P2P scenarios only. There is a logic for ConsumerTable to consume the update once a gearbox syncd sees it, leaving all rest gearbox syncd daemons to see nothing. As a result, for each update there is only one gearbox syncd that sees and handles it.
  7. There is a logic in intfsorch, copporch and vxlanorch which is to add a counter for an object only after it occurs in the VIRTORID table, which is a WA of the counter out-of-order issue. Now that the issue has been fixed in the new approach, it does not need to check anymore.

Performance analysis
The counter operations are handled in the same thread in both the new and old solutions.
In swss, the counter operation was asynchronous in the old solution and is synchronous now, which can introduce a bit more latency. However, as the number of counter operations is small, no performance degradation is observed.

@kcudnik
Copy link
Contributor

kcudnik commented Mar 22, 2024

Please fix build issues

@stephenxs
Copy link
Collaborator Author

stephenxs commented Mar 23, 2024

Please fix build issues

Hi @kcudnik
It depends on the sairedis PR (sonic-net/sonic-sairedis#1362) to define extended attributes.

@kcudnik
Copy link
Contributor

kcudnik commented Mar 23, 2024

Will review

@stephenxs stephenxs force-pushed the poc-flexcounter-new-infra branch 2 times, most recently from 15df98e to 8a349fc Compare March 28, 2024 10:16
@stephenxs
Copy link
Collaborator Author

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@stephenxs
Copy link
Collaborator Author

/azpw run

@stephenxs stephenxs force-pushed the poc-flexcounter-new-infra branch 2 times, most recently from b1303d5 to 124600f Compare March 30, 2024 02:21
@stephenxs
Copy link
Collaborator Author

/azpw run

@stephenxs
Copy link
Collaborator Author

stephenxs commented Mar 31, 2024

As mentioned in the description, the PR depends on sairedis PR (sonic-net/sonic-sairedis#1362) merged a week ago.
However, the sairedis Debian packages fetched in the build were TOO OLD to satisfy the dependency.

##[debug]ProjectId: be1b070f-be15-4154-aade-b1d3bfb17054
##[debug]PipelineVersionToDownload: latestFromBranch
##[debug]Agent environment resources - Disk: / Available 25389.00 MB out of 29593.00 MB, Memory: Used 701.00 MB out of 15968.00 MB, CPU: Usage 2.53%
##[debug]PipelineId from non-trigerringBuild: 495079
Download from the specified build: #495079
##[debug]Processed: ##vso[task.setvariable variable=BuildNumber;issecret=False;]495079
Download artifact to: /data/work/1/a/download/sairedis
Using default max parallelism.

It fetched 495079 in the build. However, it was built on 9th March, more than 2 weeks before my sairedis PR. I think this is because there has been no successful build since 495079. Can anyone help to look into it?

511120	20240330.8	master	Azure.sonic-sairedis	failed	2024-03-30T09:00:59	2024-03-30T10:55:25	594944dbe4	[Build Link](https://dev.azure.com/mssonic/be1b070f-be15-4154-aade-b1d3bfb17054/_build/results?buildId=511120)	[Artifacts](https://sonic-build.azurewebsites.net/ui/sonic/pipelines/12/builds/511120/artifacts?branchName=master)
508223	20240326.3	master	Azure.sonic-sairedis	failed	2024-03-26T14:26:18	2024-03-26T15:20:54	594944dbe4	[Build Link](https://dev.azure.com/mssonic/be1b070f-be15-4154-aade-b1d3bfb17054/_build/results?buildId=508223)	[Artifacts](https://sonic-build.azurewebsites.net/ui/sonic/pipelines/12/builds/508223/artifacts?branchName=master)
505863	20240323.8	master	Azure.sonic-sairedis	failed	2024-03-23T09:01:19	2024-03-23T10:32:08	bb948f632b	[Build Link](https://dev.azure.com/mssonic/be1b070f-be15-4154-aade-b1d3bfb17054/_build/results?buildId=505863)	[Artifacts](https://sonic-build.azurewebsites.net/ui/sonic/pipelines/12/builds/505863/artifacts?branchName=master)
500443	20240316.8	master	Azure.sonic-sairedis	failed	2024-03-16T09:01:01	2024-03-16T10:48:54	bb948f632b	[Build Link](https://dev.azure.com/mssonic/be1b070f-be15-4154-aade-b1d3bfb17054/_build/results?buildId=500443)	[Artifacts](https://sonic-build.azurewebsites.net/ui/sonic/pipelines/12/builds/500443/artifacts?branchName=master)

orchagent/vxlanorch.cpp Outdated Show resolved Hide resolved
@stephenxs stephenxs force-pushed the poc-flexcounter-new-infra branch 2 times, most recently from 9a7400d to 7301ad0 Compare April 6, 2024 08:59
@stephenxs
Copy link
Collaborator Author

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@stephenxs
Copy link
Collaborator Author

Rebased to resolve the conflict.

Comment on lines +89 to +98
extern bool gTraditionalFlexCounter;

vector<sai_object_id_t> gGearboxOids;

unique_ptr<DBConnector> gFlexCounterDb;
unique_ptr<ProducerTable> gFlexCounterGroupTable;
unique_ptr<ProducerTable> gFlexCounterTable;
unique_ptr<DBConnector> gGearBoxFlexCounterDb;
unique_ptr<ProducerTable> gGearBoxFlexCounterGroupTable;
unique_ptr<ProducerTable> gGearBoxFlexCounterTable;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like all this is related, so it could be moved to struct/class, but it seems for now that this could be like that, since OA needs ad ground up refactoring to move all global stuff to proper classes/namespaces

Copy link
Contributor

@kcudnik kcudnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe this will work, but some of changes i proposed would be desired

@@ -261,6 +272,20 @@ void initSaiApi()
sai_log_set(SAI_API_TWAMP, SAI_LOG_LEVEL_NOTICE);
}

void initFlexCounterTables()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, looks like a constructor task for proposed class :D

@@ -261,6 +272,20 @@ void initSaiApi()
sai_log_set(SAI_API_TWAMP, SAI_LOG_LEVEL_NOTICE);
}

void initFlexCounterTables()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be static inline ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need. It is called outside this compiling unit.

}
else
{
sai_s8_list.list = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ehter use initSaiRedisCounterEmptyParameter or remove that function

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

{
if (sai_switch_api == nullptr)
{
// This can happen during destruction of the orchagent daemon.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why syncd operation counter would be notifying syncd when OA is destroyed ? sai_switch_api should be destroyed at the end, only scenario i can imaged, is that SAI API init failed and sai_switch_api was NULL yet, but in that case no other operations should be performed and OA should quit, if switch API is NULL you can do nothing, that means sairedis lib initialization failed, so you will not send any notification anywaahre, and such initialization failre should be logged with critical prioirty not just error (not this log)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for mock test scenario only.
In production scenario, the test if (sai_switch_api == nullptr) won't be true.

static inline void operateFlexCounterDbSingleField(std::vector<FieldValueTuple> &fvTuples,
const string &field, const string &value)
{
if (!field.empty() && !value.empty())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this prevention here for a reason? ot at some point OA can actually populate empty counter? if any of those is empty this should be logged as ERROR

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also for mock test.
The common logic to install plugin is like this:

  1. load lua plugin and generate SHA code
  2. install the lua plugin into {plugin name: plugin SHA code} the flex counter group

In mock test scenario, it fails to load the Lua plugin, which leaves the SHA code empty string. But we still want the lua plugin to be installed into the flex counter group. By doing so, we can verify whether the plugin name is correct and meet the coverage requirement.

In production scenario, we will check whether either is empty, if so, we will not install the plugin. This is just a production and can not happen.

Comment on lines +922 to +925
if (gTraditionalFlexCounter)
{
operateFlexCounterGroupDatabase(group, poll_interval, stats_mode, plugin_name, plugins, operation, is_gearbox);
return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, normally in the OOP, such cases is resolved in proper way, by having base abstract class, and pointer in that class in OA, then base on command line in OA whether we want to use traditional counters or new one, just one of 2 objects is createed and populate the pointer, then all operations are the same, add field to counder, add counter remove counter etc, and in all entire code there is no need for gTraditionalFlexCounter) check, and i saw is checked in multiple places, and old and new counter realise the same functionality at the end, but im aware that at this stage this conversion without refactoring OA maybe not possible, it should be designed from the start

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, look all the below functions, they all contain the same check, and adding new code could confuse unaware programmer, so moving this to 2 classes with one new and second ol would be great solution to that

@kcudnik
Copy link
Contributor

kcudnik commented May 4, 2024

@stephenxs can you please handle the conflict? @kcudnik thanks for the great feedback. assuming conflicts are handled, any further feedback on the PR or this can be approved from your end?

i added some potential refactoring changes that could be desired, but i approved current solution, but i don't have permission to approve that, you need @qiluo-msft approve or @prsunny approve

@dprital
Copy link
Collaborator

dprital commented May 6, 2024

@qiluo-msft @prsunny , Can you please approve and merge ?

@stephenxs stephenxs changed the title Fix flow counter out-of-order issue by notifying counter operations using SelectableChannel Fix flex counter out-of-order issue by notifying counter operations using SelectableChannel May 6, 2024
…nter polling and create counter group

SAI switch API is invoked to enable counter polling and create counter group
CLI option to choose whether the new infra should be used

Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants