Skip to content

[syncd] Update log level for bulk api #1492

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 14, 2025

Conversation

jianyuewu
Copy link
Contributor

@jianyuewu jianyuewu commented Jan 3, 2025

Log level should be warning instead of error, because functionality is fine. Latter code will fall back to use single api instead of bulk api, when syncd support bulk but sai doesn't support it.

Background
When syncd added support for bulk APIs like bulk set, but sai didn't implement it, this error will be printed:
"processBulkOidSet: bulk set api is not implemented or not supported, object_type = SAI_OBJECT_TYPE_QUEUE", and it will fail the cases. Actually the print should be warning, as in current code, bulk set fall back to single set in latter code, so sai will still receive single set APIs, so functionality is actually OK.

What I did
Update log level from error to warning.

How to verify it
When bulk set api is added in syncd, but sai did not implement it, only warning logs will be printed, instead of error logs.

@mssonicbld
Copy link
Collaborator

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

kcudnik
kcudnik previously approved these changes Jan 4, 2025
@mssonicbld
Copy link
Collaborator

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jianyuewu
Copy link
Contributor Author

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jianyuewu
Copy link
Contributor Author

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jianyuewu
Copy link
Contributor Author

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@keboliu keboliu requested a review from liat-grozovik January 20, 2025 02:44
@jianyuewu
Copy link
Contributor Author

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jianyuewu
Copy link
Contributor Author

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jianyuewu
Copy link
Contributor Author

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

@mssonicbld
Copy link
Collaborator

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@@ -78,7 +78,7 @@ void syncdMlnxWorkerThread()
commandLineOptions->m_enableTempView = false;
commandLineOptions->m_disableExitSleep = true;
commandLineOptions->m_enableUnittests = false;
commandLineOptions->m_enableSaiBulkSupport = true;
commandLineOptions->m_enableSaiBulkSupport = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to leverage the pipeline to check whether coverage can pass in this way. we will have a formal solution if it can.
it's not easy for us to check coverage locally.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can actually do it in docker, it will cost you some time to setup this locally, and then you can run gcovr following steps taht are located in *yml files

i don;t have this steps locally, but i usually write unittests for code that i add/change

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks. indeed. we usually toggle breakpoints and check whether breakpoints are reached.
but sometimes even if the breakpoints are reached, coverage reports uncovered..
we will try steps in yaml.

@jianyuewu
Copy link
Contributor Author

Looks inside this function in official code, there is some bug:

bool VirtualOidTranslator::tryTranslateRidToVid(
In sai_object_id_t rid,
Out sai_object_id_t &vid)
{
...
auto it = m_rid2vid.find(vid); // Maybe m_rid2vid.find(rid); ?
...
@kcudnik @stephenxs Local UT seems failed here, could I update it in some other ticket or is it an issue?

@kcudnik
Copy link
Collaborator

kcudnik commented Feb 12, 2025

whats the erro you get ?

yes that should be rid there, and it needs to be fixed, by lucy coincidence this is always go to client oinstead of cache

@jianyuewu
Copy link
Contributor Author

Got this error:TestSyncd.cpp:476: Failure
Actual function call count doesn't match EXPECT_CALL(*channel, empty())...
Expected: to be called at least once
Actual: never called - unsatisfied and active
unknown file: Failure
C++ exception with description ":- postPortRemove: expected rid oid:0x0 to be present in RIDTOVID" thrown in the test body.
[Thread 0x7ffff6cc86c0 (LWP 29131) exited]
[ FAILED ] SyncdTest.BulkRemoveNotImplemented (34118 ms)
Yes, maybe this error not related with m_rid2vid, as I later updated it, seems no impact.

@kcudnik
Copy link
Collaborator

kcudnik commented Feb 13, 2025

Got this error:TestSyncd.cpp:476: Failure Actual function call count doesn't match EXPECT_CALL(*channel, empty())... Expected: to be called at least once Actual: never called - unsatisfied and active unknown file: Failure C++ exception with description ":- postPortRemove: expected rid oid:0x0 to be present in RIDTOVID" thrown in the test body. [Thread 0x7ffff6cc86c0 (LWP 29131) exited] [ FAILED ] SyncdTest.BulkRemoveNotImplemented (34118 ms) Yes, maybe this error not related with m_rid2vid, as I later updated it, seems no impact.

is taht error locally only ? since on pipeline all tests pass ok

@jianyuewu
Copy link
Contributor Author

jianyuewu commented Feb 14, 2025

Got this error:TestSyncd.cpp:476: Failure Actual function call count doesn't match EXPECT_CALL(*channel, empty())... Expected: to be called at least once Actual: never called - unsatisfied and active unknown file: Failure C++ exception with description ":- postPortRemove: expected rid oid:0x0 to be present in RIDTOVID" thrown in the test body. [Thread 0x7ffff6cc86c0 (LWP 29131) exited] [ FAILED ] SyncdTest.BulkRemoveNotImplemented (34118 ms) Yes, maybe this error not related with m_rid2vid, as I later updated it, seems no impact.

is taht error locally only ? since on pipeline all tests pass ok

Yes, that is local only, seems caused by that port is already removed, but mock still return valid, will add some condition to tell mock that port is already deleted.
Currently local callstack is:
(gdb) bt
#0 0x00007ffff75f40a1 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#1 0x00007ffff7e8ca86 in ?? () from /lib/x86_64-linux-gnu/libswsscommon.so.0
#2 0x00005555556f546a in syncd::VirtualOidTranslator::translateVidToRid (this=0x5555558f1e10, vid=) at /usr/include/c++/12/bits/basic_string.h:233
#3 0x00005555556cc5b1 in syncd::Syncd::processBulkOidRemove (this=this@entry=0x5555558dd0d0, objectType=objectType@entry=SAI_OBJECT_TYPE_PORT, mode=mode@entry=SAI_BULK_OP_ERROR_MODE_IGNORE_ERROR,
objectIds=std::vector of length 1, capacity 1 = {...}, statuses=std::vector of length 1, capacity 1 = {...}) at /usr/include/c++/12/bits/shared_ptr_base.h:1665
#4 0x00005555556d864a in syncd::Syncd::processBulkOid (this=this@entry=0x5555558dd0d0, objectType=SAI_OBJECT_TYPE_PORT, objectIds=std::vector of length 1, capacity 1 = {...}, api=api@entry=SAI_COMMON_API_BULK_REMOVE,
attributes=std::vector of length 1, capacity 1 = {...}, strAttributes=std::vector of length 1, capacity 1 = {...}) at ./syncd/Syncd.cpp:2217
#5 0x00005555556d92bc in syncd::Syncd::processBulkQuadEvent (this=this@entry=0x5555558dd0d0, api=api@entry=SAI_COMMON_API_BULK_REMOVE, kco=std::tuple containing = {...}) at ./syncd/Syncd.cpp:901
#6 0x00005555556dffbb in syncd::Syncd::processSingleEvent (this=this@entry=0x5555558dd0d0, kco=std::tuple containing = {...}) at ./syncd/Syncd.cpp:382
#7 0x00005555556e0363 in syncd::Syncd::processEvent (this=0x5555558dd0d0, consumer=...) at ./syncd/Syncd.cpp:342
#8 0x00005555555fe180 in SyncdTest_BulkRemoveNotImplemented_Test::TestBody (this=0x5555558dc560) at /usr/include/c++/12/bits/shared_ptr_base.h:1665
#9 0x0000555555654177 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::)(), char const) ()

@kcudnik
Copy link
Collaborator

kcudnik commented Feb 14, 2025

This is not my code since I did not added mock methods, but mock can return independent values than actual function, c
So maybe entire test case is wrong

Log level should be warning instead of error, because functionality is
fine. Latter code will fall back to use single api instead of bulk api,
when syncd support bulk but sai doesn't support it.

Signed-off-by: Jianyue Wu <jianyuew@nvidia.com>
@mssonicbld
Copy link
Collaborator

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jianyuewu
Copy link
Contributor Author

This is not my code since I did not added mock methods, but mock can return independent values than actual function, c So maybe entire test case is wrong

Yes, I tried to rewrite test case, now it is fine, thanks for the hint:)

@mssonicbld
Copy link
Collaborator

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jianyuewu
Copy link
Contributor Author

Now coverage passed:)
Missing: 0 lines
Coverage: 100%
Threshold: 80%

@@ -296,4 +321,323 @@
}));
syncd_object.processEvent(consumer);
}

TEST_F(SyncdTest, BulkCreateTest)

Check warning

Code scanning / CodeQL

Poorly documented large function Warning

Poorly documented function: fewer than 2% comments for a function of 137 lines.
@kcudnik
Copy link
Collaborator

kcudnik commented Feb 14, 2025

Now coverage passed:) Missing: 0 lines Coverage: 100% Threshold: 80%

It's ironic since changing 1 line of code and adding 10 of unit tests will pass code coverage since unit tests are also included on code coverage 🤣 and the test don't need to do actual testing 😂

@kcudnik kcudnik merged commit f464a1e into sonic-net:master Feb 14, 2025
15 checks passed
@jianyuewu
Copy link
Contributor Author

jianyuewu commented Feb 17, 2025

Now coverage passed:) Missing: 0 lines Coverage: 100% Threshold: 80%

It's ironic since changing 1 line of code and adding 10 of unit tests will pass code coverage since unit tests are also included on code coverage 🤣 and the test don't need to do actual testing 😂

Ah, yes, that is really tricky😂 BTW, this time I confirmed with GDB, and those logs are indeed covered.

@jianyuewu jianyuewu deleted the bulkset_loglevel branch February 17, 2025 05:26
@dprital
Copy link
Collaborator

dprital commented Feb 18, 2025

@jianyuewu , Please provide direct PR to 202411 as this one got conflicts.

@jianyuewu
Copy link
Contributor Author

jianyuewu commented Feb 18, 2025

@jianyuewu , Please provide direct PR to 202411 as this one got conflicts.

@dprital Thanks for remind, backport PR to 202411 created: #1532

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants